Abstract

We present a toolbox for extracting asymptotic information on the 
coefficients of combinatorial generating functions. This toolbox notably 
includes a treatment of the effect of Hadamard products on singularities 
in the context of the complex Tauberian technique known as singularity 
analysis. As a consequence, it becomes possible to unify the analysis of 
a number of divide-and-conquer algorithms, or equivalently random tree 
models, including several classical methods for sorting, searching, and 
dynamically managing equivalence relations. 

This study was motivated by a desire to unify the analysis of a number of
algorithms and data structures of computer science. By analysis we mean here
(precise) average-case analysis of cost functions as introduced by Knuth and
illustrated in the collection [41] as well as in his monumental series, The Art of
Computer Programming (see especially [39, 40]). In the first part of this paper
(Sections 1 and 2), we consider a major paradigm of algorithmic design, the
"divide-and-conquer" principle, which is closely related to families of random
trees and associated "tree recurrences". The basic framework is described in
Section 1, while lead examples are introduced in Section 2 below. Our treatment
rests on combinatorial generating functions.

The central part of this paper (Sections 3 and 4) is devoted to the process
of extracting coefficients, at least asymptotically, from generating functions.
Singularities have long been recognized to contain highly useful information in
this regard, and we start by recalling in Section 3 the basic principles of the
complex Tauberian approach known as "singularity analysis". Applications to
algorithms and trees require, in particular, techniques for coping with generating
functions that may be constructed by a tower of several transformations. Here,
we develop the theory of composition of singularities under Hadamard products
in Section 4. (The reader only interested in complex-analytic aspects can jump
directly to Sections 3 and 4.)

The final part (Sections 5 and 6) returns to the original problem of analysing
divide-and-conquer algorithms, taking full advantage of the analytic results of
previous sections. Tree recurrences and first moments form the subject of
Section 5, where full asymptotic expansions are derived for expectations of costs.
Section 6 describes possible extensions of the basic framework to the
determination of variances and higher moments as well as to some other random tree
models.

1 Introduction 

"Divide-and-Conquer" is a major principle of algorithmic design in computer 
science. An instance (I) of a problem to be solved is first split into smaller
subproblems (I', I'') that are solved recursively by the same process; the partial
solutions are then woven back to yield a solution to the original problem. The
abstract scheme is then of the form:

    solve(I)
        (I', I'') := split(I);
        J' := solve(I'); J'' := solve(I'');
        return weave(J', J'').                                          (1)

(Problems of size smaller than a certain threshold are treated directly without
any recursive call.) Algorithms resorting to the scheme (1) include classical
sorting methods (mergesort, quicksort, radix-exchange sort), data structures
based on trees (binary search trees, digital trees known as "tries", quadtrees for
multidimensional search, union-find trees) as well as various methods used in
computational geometry, distributed computation, and communication theory.
We refer the reader to classical books on data structures, algorithms, and
analysis of algorithms for details.

In general, a class of probabilistic models $\mathfrak{M}_n$ indexed by the size $n$ of the
problem instance is assumed to reflect the nature of data fed to the algorithm.
A cost function (typically, the number of certain operations performed by the
algorithm) then becomes a random variable $X_n$ whose form is induced by $\mathfrak{M}_n$
and the particular divide-and-conquer algorithm considered. The problem is
then to obtain characteristics of $X_n$, for instance its mean, higher moments,
or even distributional information. The asymptotic limit $n \to \infty$ is usually
considered, since an important phenomenon of "asymptotic simplification" is to
be expected in a large number of situations.
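
As an illustration of the scheme (1), here is a minimal Python sketch that instantiates split and weave for quicksort and counts key comparisons. The function names, the choice of the first element as pivot, and the convention of returning the cost alongside the result are assumptions made for this example only.

```python
def solve(instance):
    """Sort a list following scheme (1); return (sorted_list, comparisons)."""
    if len(instance) <= 1:            # below the threshold: no recursive call
        return list(instance), 0
    pivot, left, right, toll = split(instance)
    sorted_left, cost_left = solve(left)
    sorted_right, cost_right = solve(right)
    return weave(sorted_left, pivot, sorted_right), toll + cost_left + cost_right


def split(instance):
    """Partition around the first element; the toll is n - 1 comparisons."""
    pivot, rest = instance[0], instance[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return pivot, left, right, len(rest)


def weave(sorted_left, pivot, sorted_right):
    """Recombine the partial solutions around the pivot."""
    return sorted_left + [pivot] + sorted_right
```

In this instance one element (the pivot) is set aside at each split, and the toll paid on an instance of size n is n - 1 comparisons.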

Under natural conditions, a recurrence that closely mimics the recursive
structure of (1) relates the random variables $X_n$:

$$ X_n = t_n + X_{K_n} + \widetilde{X}_{n-a-K_n}. \tag{2} $$

The interpretation is as follows: $t_n$ is a quantity$^1$, called the "toll", that
represents the cost incurred by splitting the initial instance and weaving back the
final solution; $K_n$ is the (random) size of the first subproblem, in which case
the second subproblem has a size that is the complement of $K_n$ to $n - a$, for
some small constant $a$ (usually, $a = 0$ or $a = 1$), which is specific to the
algorithm considered. The random variables of type $X$ and $K$ are assumed to be
independent, as are the two $X$-sequences $(X)$ and $(\widetilde{X})$ on the right in (2), and a
subproblem of size $k$ is assumed to satisfy model $\mathfrak{M}_k$; this property is sometimes
called "randomness preservation" and is satisfied by many cases of algorithmic
interest. A direct asymptotic treatment of the recursive relation (2) binding
random variables is sometimes feasible; see the (metric) "contraction method"
surveyed by Rösler and Rüschendorf [54] and applied by Neininger [50] to a
subset of the problems discussed here.

Turning to average-case analysis, the expected cost $f_n := E(X_n)$ satisfies a
recurrence that is directly implied by (2):

$$ f_n = t_n + \sum_{k} p_{n,k} (f_k + f_{n-a-k}), \tag{3} $$

with the splitting probabilities $p_{n,k} := \Pr(K_n = k)$ being determined by the
model $\mathfrak{M}_n$ used. Trees are naturally associated with recursive procedures, and,
accordingly, the recurrence (3) can be viewed as associated with a random tree
model of the following form: the root has size $a$, the left subtree has size $k$ with
probability $p_{n,k}$, and the right subtree has the remaining quantity $n - a - k$ as
size. Then (3) is interpreted as giving the expectation of a cost function over
the tree structure that is induced by the family of tolls, $t_n$. For this reason, a
recurrence having the form (3) is called a tree recurrence. Tree recurrences are
the main object of study of this paper.
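
The tree recurrence (3) can be unwound numerically for small sizes. The sketch below does so with exact rational arithmetic; the function names are hypothetical, and the sketch assumes a >= 1 (so that subproblems are strictly smaller) and that sizes n < a are treated directly with cost t_n.

```python
from fractions import Fraction


def expected_costs(tolls, split_prob, a=1):
    """Return [f_0, ..., f_N] solving f_n = t_n + sum_k p_{n,k} (f_k + f_{n-a-k}).

    tolls      -- list [t_0, ..., t_N]
    split_prob -- function (n, k) -> p_{n,k}, for 0 <= k <= n - a
    a          -- size set aside at each split (this sketch requires a >= 1)
    """
    if a < 1:
        raise ValueError("this sketch needs a >= 1 so that subproblems shrink")
    f = []
    for n, toll in enumerate(tolls):
        total = Fraction(toll)
        if n >= a:
            for k in range(n - a + 1):
                total += split_prob(n, k) * (f[k] + f[n - a - k])
        f.append(total)
    return f


def uniform_split(n, k):
    """Uniform splitting probabilities p_{n,k} = 1/n."""
    return Fraction(1, n)


# Toll t_n = n - 1 for n >= 1 with t_0 = 0, under uniform splitting.
example = expected_costs([0, 0, 1, 2, 3], uniform_split)
print(example)
```

With uniform splitting and toll n - 1, the values obtained are the classical expected comparison counts of quicksort for small n.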

One way to view the tree recurrence (3) is as a linear transformation on
sequences

$$ (f_n) = \mathcal{K}[(t_n)], \tag{4} $$

that takes a toll sequence $(t_n)$ and returns the corresponding average-cost
sequence $(f_n)$. The functional $\mathcal{K}$ is fully determined by the splitting probabilities
$p_{n,k}$. A classical approach to the derivation of explicit forms consists in
introducing generating functions (GFs). Fix a sequence of normalization constants
$\omega_n$ (that are problem-specific) and define the generating functions

$$ f(z) := \sum_{n} f_n \omega_n z^n, \qquad t(z) := \sum_{n} t_n \omega_n z^n. $$

1 Some analyses require a randomly varying toll. For mean value analysis, the distinction 
between deterministic and stochastic tolls is, however, immaterial. 

Then, the transformation $\mathcal{K}$ induces another linear transformation $\mathcal{L}$ on GFs:

$$ f(z) = \mathcal{L}[t(z)]. \tag{5} $$

With an adequate choice of the constants $\omega_n$, explicit forms of $f_n$ can often be
obtained, provided at least that toll sequences are of a simple enough form.

Our main objective is to develop generating-function methods by which one
can quantify the way the asymptotic form of (expected) costs relates to
properties of the toll sequence. It is known that asymptotic properties of number
sequences (as the index $n$ tends to infinity) are closely related to the nature
of the singularities of the corresponding generating functions. This suggests
that we examine the way the operator $\mathcal{L}$ operates on scales of singular functions
and view it as a "singularity transformer". Informally, there is a transformation
$\widetilde{\mathcal{L}}$, induced by $\mathcal{L}$ and acting on an asymptotic scale of functions singular at
some fixed point $z_0$. Using $\mathrm{Sing}(f(z))$ to denote the expansion of $f(z)$ at the
singularity $z_0$, one has

$$ \mathrm{Sing}(f(z)) = \widetilde{\mathcal{L}}[t(z)]. \tag{6} $$

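Under fairly general conditions, there is a tight coupling between singular
expansions of a generating function and the asymptotic form of its coefficients.
The outcome of this process, justified by singularity analysis [24, 51], is a direct
relation written figuratively as

$$ \mathrm{Asympt}((f_n)) = \widetilde{\mathcal{K}}[(t_n)], \tag{7} $$

where $\widetilde{\mathcal{K}}$ depends on $(t_n)$ via the structure of its generating function $t(z)$.

The path we follow in this paper is the one given by (4)-(7), which is then
globally summarized by the following diagram:

$$
\begin{array}{ccc}
(t_n) & \xrightarrow{\ \mathcal{K}\ } & (f_n) \\
\downarrow & & \uparrow \\
t(z) & \xrightarrow{\ \mathcal{L}\ } & f(z)
\end{array}
\quad \Longrightarrow \quad
\begin{array}{ccc}
(t_n) & \xrightarrow{\ \widetilde{\mathcal{K}}\ } & \mathrm{Asympt}((f_n)) \\
\downarrow & & \uparrow \\
t(z) & \xrightarrow{\ \widetilde{\mathcal{L}}\ } & \mathrm{Sing}(f(z))
\end{array}
\tag{8}
$$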

We propose to develop a collection of generic tools that supplement the basic 
singularity analysis framework of Flajolet and Odlyzko [24]. In particular, we 
discuss in the next sections the action on singularities of differential and integral 
operators, as well as of Hadamard products. As a result, the way $\mathcal{L}$ operators
associated with many recurrences transform singularities can be analyzed
precisely. This in turn yields a fairly general classification of the asymptotic growth
phenomena associated to a variety of classical tree recurrences, including the 
ones of binary search trees, binary trees, and union-find trees, which will serve 
here as guiding examples. 

Most of the existing computer science literature is devoted to the
"deterministic" divide-and-conquer recurrences that correspond to a splitting size $K_n$ that



Singularity Analysis and Tree Recurrences 






is deterministic, depending on $n$ alone; typically, $K_n = \lfloor n/2 \rfloor$. In such a case,
the probability distribution $(p_{n,k})_{k=0}^{n}$ is supported at a single point. The main
asymptotic order of $f_n$ is then given by what Cormen, Rivest, and Leiserson have
termed "master theorems": see, e.g., [10]. (Usually, the finer characteristics
of the asymptotic regime involve fractal fluctuations [22, 58].) What we
consider here instead are methods for dealing with "stochastic" divide-and-conquer
recurrences, where $K_n$ is a random variable (dependent on $n$) with support
spread over a whole subinterval in $(0, n)$. This stochastic case is discussed by
Roura in [55]: Roura's arguments are based on elementary real analysis, so that
they are of quite a wide scope, but his estimates are by nature mostly confined
to first-order asymptotics. In this article we show that, in the many cases of
practical interest where some strong complex-analytic structure is present, full
asymptotic expansions can be derived. Our treatment is somewhat parallel in
spirit to that of Knuth and Pittel whose inspiring work [42] provided one of the
initial motivations$^2$ for the present study. An additional benefit of the
complex-analytic approach is that it often gives access to variances and higher moments,
in which case the limit distribution of costs can be identified.

2 Some "special" tree recurrences 

In this section, we briefly review some tree recurrences that are of special interest 
in combinatorial mathematics and analysis of algorithms. 

2.1 The binary search tree recurrence 

One of the simplest models of random trees is defined as follows: To determine
a tree $T_n$ of size $n \ge 1$, take a root and append to it a left subtree of size $k$ and
a right subtree of size $n - 1 - k$, where $k$ is uniformly distributed over the set of
permissible values $\{0, 1, \ldots, n-1\}$; a tree of size 0 is the empty tree. In earlier
notations, this process corresponds to

$$ p_{n,k} = \Pr(K_n = k) := \frac{1}{n}, \qquad \text{for } k = 0, 1, \ldots, n-1. \tag{9} $$

As is well known, the model defined by (9) corresponds to random trees defined
by either the binary search tree data structure or the quicksort algorithm [40,
47, 48, 58, 62]. The corresponding tree recurrence (3) is then

$$ f_n = t_n + \frac{2}{n} \sum_{k=0}^{n-1} f_k, \tag{10} $$

with $f_0 := t_0$.



2 See also Pittel's interesting recent article [53] which appeared while our own work was 
still in progress. 

Ordinary GFs determined by the choice of coefficients $\omega_n = 1$ for all $n$ are
then

$$ f(z) = \sum_{n \ge 0} f_n z^n, \qquad t(z) = \sum_{k \ge 0} t_k z^k, $$

and standard rules for the manipulation of GFs translate (10) into a linear
integral equation

$$ f(z) = t(z) + 2 \int_0^z f(w) \, \frac{dw}{1-w}. $$

Differentiation yields the ordinary differential equation

$$ f'(z) = t'(z) + \frac{2}{1-z} f(z), $$

which is then solved by the variation-of-constants method:

$$ f(z) = \mathcal{L}[t(z)], \quad \text{where} \quad \mathcal{L}[t(z)] := (1-z)^{-2} \int_0^z (\partial_w t(w)) (1-w)^2 \, dw. \tag{11} $$

In (11) we have assumed without loss of generality the initial conditions $t_0 = f_0 = 0$
(thanks to linearity and the fact that the transform of $t_n = \delta_{n,0}$ is $n + 1$).
The notation $\partial_w$, borrowed from differential algebra, is used to denote derivatives
whenever the operator nature of transformations is to be stressed.

It is instructive to follow what Greene and Knuth call the "repertoire"
approach [31]. This consists in building a repertoire of the ($\mathcal{K}$ or $\mathcal{L}$) transforms of
basic tolls, then trying to determine the effect of a new toll by expressing it in
the basis of known tolls. What is convenient here is the class of tolls

$$ t_n^{\alpha} := \binom{n+\alpha}{\alpha} = \frac{(\alpha+1)(\alpha+2)\cdots(\alpha+n)}{n!}, \qquad t^{\alpha}(z) = (1-z)^{-\alpha-1}. $$

Then, by (11) one finds, for $\alpha \ne 1$,

$$ f^{\alpha}(z) = \frac{\alpha+1}{\alpha-1} \left[ (1-z)^{-\alpha-1} - (1-z)^{-2} \right], \qquad f_n^{\alpha} = \frac{\alpha+1}{\alpha-1} \left[ \binom{n+\alpha}{\alpha} - \binom{n+1}{1} \right], $$

while $\alpha = 1$ leads to

$$ f^{1}(z) = \frac{2}{(1-z)^2} \log\frac{1}{1-z}, \qquad f_n^{1} = 2(n+1)(H_{n+1} - 1), $$

with $H_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}$ the $n$th harmonic number.
Stirling's formula implies asymptotically, for $\alpha$ not a negative integer,

$$ \binom{n+\alpha}{\alpha} \sim \frac{n^{\alpha}}{\Gamma(\alpha+1)}, $$

with $\Gamma$ the Euler gamma function. Then what goes on is summarized by the
following table:

$$
\begin{array}{ll}
\alpha > 1: & f_n \sim \dfrac{\alpha+1}{\alpha-1} \, \dfrac{n^{\alpha}}{\Gamma(\alpha+1)} \\[2ex]
\alpha = 1: & f_n \sim 2 n \log n \\[1ex]
0 < \alpha < 1: & f_n \sim \dfrac{\alpha+1}{1-\alpha} \, n.
\end{array}
\tag{12}
$$

The discontinuity in the asymptotic regime of $f_n$ at $\alpha = 1$, where a logarithm
appears, is noticeable. Also, the tolls in the scale satisfying $t_n \ll n$ are seen to
induce costs that all collapse to linear functions.
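
The closed forms of the repertoire can be checked against the recurrence (10) with exact arithmetic. The following Python sketch does this for $\alpha = 1$ and $\alpha = 2$; the helper names are hypothetical, and the tolls are taken with the convention $t_0 = 0$ that underlies (11).

```python
from fractions import Fraction
from math import comb


def bst_costs(tolls):
    """Solve f_n = t_n + (2/n) * sum_{k<n} f_k with f_0 = t_0, exactly."""
    f, running_sum = [], Fraction(0)
    for n, toll in enumerate(tolls):
        value = Fraction(toll) + (2 * running_sum / n if n > 0 else 0)
        f.append(value)
        running_sum += value
    return f


def harmonic(n):
    """Harmonic number H_n as an exact fraction."""
    return sum(Fraction(1, j) for j in range(1, n + 1))


N = 30
# alpha = 1: toll binom(n+1, 1) = n + 1 for n >= 1, with t_0 = 0.
f_log = bst_costs([0] + [n + 1 for n in range(1, N + 1)])
# alpha = 2: toll binom(n+2, 2) for n >= 1, with t_0 = 0.
f_quad = bst_costs([0] + [comb(n + 2, 2) for n in range(1, N + 1)])

for n in range(N + 1):
    assert f_log[n] == 2 * (n + 1) * (harmonic(n + 1) - 1)
    assert f_quad[n] == 3 * (comb(n + 2, 2) - (n + 1))
```

For $\alpha = 2$ the factor $(\alpha+1)/(\alpha-1)$ equals 3, which is the constant appearing in the last assertion.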

A full discussion of the binary search tree recurrence necessitates
determining the effect of toll functions like $\sqrt{n}$, $\log n$, and $1/n^2$, a task which is not
entirely elementary. By the remarks of the introduction, this involves
determining the singularities of the corresponding generating functions and, in view
of (11), making explicit the way singular expansions get composed under
differentiation and integration (Section 3). This subject will then be taken up again
in Section 5.1; the particular case of the toll $t_n = \log n$ is of special importance
and will be treated in detail there.



2.2 The uniform binary tree recurrence 

This recurrence is of the form ($n > 0$, with the convention $f_0 := t_0$)

$$ f_n = t_n + \sum_{k=0}^{n-1} \frac{C_k C_{n-1-k}}{C_n} (f_k + f_{n-1-k}), \quad \text{with} \quad C_n := \frac{1}{n+1} \binom{2n}{n}, \tag{13} $$

a Catalan number. It corresponds to the uniform model of binary trees, where
all the $C_n$ binary trees with $n$ internal nodes are taken with equal likelihood.
Indeed, the number of trees of size $n$ satisfies the recurrence

$$ C_n = \sum_{k=0}^{n-1} C_k C_{n-1-k} \quad (n \ge 1), \qquad C_0 = 1, \tag{14} $$

as seen from a root decomposition. The quantity $p_{n,k} = C_k C_{n-1-k} / C_n$ is then
the probability that a tree of size $n$ has left and right subtrees of respective sizes
$k$ and $n - 1 - k$.

The GF of Catalan numbers satisfies a relation that is the image of the
recurrence (14), namely, $C(z) = 1 + z C(z)^2$, so that

$$ C(z) = \frac{1}{2z} \left( 1 - \sqrt{1-4z} \right). \tag{15} $$

In order to solve (13) by generating functions, one should use as normalization
constants the quantities $\omega_n = C_n$, and introduce

$$ t(z) := \sum_{n \ge 0} t_n C_n z^n, \qquad f(z) := \sum_{n \ge 0} f_n C_n z^n. \tag{16} $$
Then (13) translates into a linear algebraic equation,

$$ f(z) = t(z) + 2 z C(z) f(z), $$

from which the form of the $\mathcal{L}$ operator immediately results:

$$ f(z) = \mathcal{L}[t(z)], \quad \text{where} \quad \mathcal{L}[t(z)] = \frac{1}{\sqrt{1-4z}} \, t(z). \tag{17} $$

This form makes it possible to analyze directly only a restricted collection of
tolls, for instance, ones of the form $t_n^{r} := (n+1) n \cdots (n-r+2)$ (by
differentiation), or $t_n^{-r} = 1/((n+2)(n+3) \cdots (n+r+1))$ (by integration). However, tolls
of such simple forms as $\sqrt{n}$, $H_n$, and $\log n$ are left out of the scale of the $t_n^{r}$.

Define the Hadamard product of two entire series or two functions analytic
at the origin, $a$ and $b$, as their termwise product,

$$ a(z) \odot b(z) = \sum_{n \ge 0} a_n b_n z^n, \quad \text{if} \quad a(z) = \sum_{n \ge 0} a_n z^n, \quad b(z) = \sum_{n \ge 0} b_n z^n. \tag{18} $$

Then, from (16) and (17), the cost functional is expressed by the modified
transformation (of $\mathcal{L}$ type)

$$ f(z) = \frac{\tau(z) \odot C(z)}{\sqrt{1-4z}}, \quad \text{where} \quad f(z) = \sum_{n} f_n C_n z^n, \quad \tau(z) := \sum_{n} t_n z^n. \tag{19} $$

This now relates the ordinary generating function $\tau(z)$ of the tolls and the
normalized generating function $f(z)$ of the costs (with the $\omega_n = C_n$ normalization)
via a Hadamard product.
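
The relation (19) can be verified coefficientwise for small $n$. The sketch below solves (13) directly and compares $f_n C_n$ with the coefficients of $(\tau \odot C)(z)/\sqrt{1-4z}$, using the classical expansion $1/\sqrt{1-4z} = \sum_k \binom{2k}{k} z^k$. The function names and the test toll are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb


def catalan(n):
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def catalan_costs(tolls):
    """Solve recurrence (13) exactly, with f_0 = t_0."""
    f = []
    for n, toll in enumerate(tolls):
        total = Fraction(toll)
        for k in range(n):
            p = Fraction(catalan(k) * catalan(n - 1 - k), catalan(n))
            total += p * (f[k] + f[n - 1 - k])
        f.append(total)
    return f


def via_hadamard(tolls):
    """Coefficients f_n C_n read off (19)."""
    hadamard = [t * catalan(n) for n, t in enumerate(tolls)]   # tau Hadamard C
    # Dividing by sqrt(1 - 4z) is a convolution with binom(2k, k).
    return [
        sum(comb(2 * k, k) * hadamard[n - k] for k in range(n + 1))
        for n in range(len(tolls))
    ]


tolls = [n * n + 1 for n in range(12)]          # an arbitrary integer toll
costs = catalan_costs(tolls)
normalized = via_hadamard(tolls)
assert all(costs[n] * catalan(n) == normalized[n] for n in range(len(tolls)))
```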

Determining the way costs get transformed under this model then
necessitates a way to combine singular expansions under Hadamard products. This
is the central part of our article; see Section 4, where a general theorem is
stated. The "critical" value for tolls at which a discontinuity in the induced
costs manifests itself is now at $t_n = \sqrt{n}$, and$^3$

$$
\begin{array}{ll}
t_n = n^{\alpha}, \ \alpha > 1/2: & f_n = \Theta(n^{\alpha + 1/2}) \\
t_n = n^{1/2}: & f_n = \Theta(n \log n) \\
t_n = n^{\alpha}, \ 0 < \alpha < 1/2: & f_n = \Theta(n).
\end{array}
\tag{20}
$$

This phenomenon, observed in [28, Prop. 2] [of which (20) above corrects a few
misprints], neatly distinguishes the binary Catalan model from the binary search
tree model, as seen by comparing (20) to (12). A proof accompanied by complete
expansions will be given in the application section: see Section 5.2 below.

3 The notation $x = \Theta(y)$ expresses the inequalities $c_1 y < x < c_2 y$ for some constants $c_1$, $c_2$
satisfying $0 < c_1 < c_2 < +\infty$.

2.3 The union-find tree recurrence

By a result attributed to Cayley, there are $U_n = n^{n-2}$ "free" unrooted trees
(i.e., labeled connected acyclic graphs) on $n$ nodes, and, accordingly, $T_n = n^{n-1}$
rooted trees. Consider the model in which initially each unrooted tree of size $n$
is taken with equal likelihood. Choose an edge at random amongst any of the
possible $n - 1$ edges of the tree, orient it in a random way, then cut it. This
separates the tree into an ordered pair of smaller trees that are now rooted.
Continue the process with each of the resulting subtrees, discarding the root.
Assume that the cost incurred by selecting the edge and splitting the tree is
$t_n$. Then the total cost incurred when starting from a random unrooted tree
and recursively splitting it till the completely disconnected graph is obtained
satisfies the recurrence ($n > 1$)

$$ f_n = t_n + \sum_{0 < k < n} p_{n,k} (f_k + f_{n-k}), \quad \text{where} \quad p_{n,k} = \binom{n}{k} \frac{k^{k-1} (n-k)^{n-k-1}}{2 (n-1) n^{n-2}}. \tag{21} $$

(Proof: There are $n^{n-1}$ rooted trees on $n$ nodes and the binomial coefficient
takes care of relabellings.) The recurrence (21) has been studied in great detail
by Knuth and Pittel in [42], an article that largely motivated our study. In fact,
there are good algorithmic reasons for considering the recurrence (21): if time
is reversed, then the recursion describes the evolution of a random graph from
totally disconnected to tree-like, when successive edges are added at random.
The latter is exactly the probabilistic model involved in the "union-find" (or
equivalence-finding) algorithm [10, 57, 62], for which detailed analyses had been
provided by Knuth and Schönhage [43] in 1978$^4$. (Note that this model is not
the same as the simply generated family of Cayley trees.)
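
As a sanity check on (21), the splitting probabilities should sum to 1 over $0 < k < n$, and a unit toll per split should give the cost $n - 1$ (one unit for each of the $n - 1$ edges cut). The Python sketch below verifies both with exact arithmetic; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def union_find_split_prob(n, k):
    """Splitting probability p_{n,k} of (21), for 0 < k < n."""
    numerator = comb(n, k) * k ** (k - 1) * (n - k) ** (n - k - 1)
    return Fraction(numerator, 2 * (n - 1) * n ** (n - 2))


def union_find_costs(tolls):
    """Solve (21) exactly; tolls[n] is t_n, and sizes n <= 1 cost t_n."""
    f = []
    for n, toll in enumerate(tolls):
        total = Fraction(toll)
        for k in range(1, n):
            total += union_find_split_prob(n, k) * (f[k] + f[n - k])
        f.append(total)
    return f


# The probabilities form a distribution for every n >= 2.
for n in range(2, 40):
    assert sum(union_find_split_prob(n, k) for k in range(1, n)) == 1

# Unit toll per split (t_0 = t_1 = 0): the cost is the number of edges, n - 1.
unit_costs = union_find_costs([0, 0] + [1] * 8)
assert unit_costs[1:] == list(range(9))
```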

Let $T(z)$, $U(z)$ be the exponential generating functions of the sequences $(T_n)$,
$(U_n)$, that is,

$$ T(z) = \sum_{n \ge 1} n^{n-1} \frac{z^n}{n!}, \qquad U(z) = \sum_{n \ge 1} n^{n-2} \frac{z^n}{n!}. $$

It is a well-known fact of combinatorics that $T(z)$ satisfies the functional relation
$T(z) = z e^{T(z)}$, and one has $U = T - (T^2/2)$; see, for instance, [39, 59]. Define now the
generating functions

$$ t(z) = \sum_{n \ge 1} t_n (n-1) n^{n-2} \frac{z^n}{n!}, \qquad f(z) = \sum_{n \ge 1} f_n n^{n-1} \frac{z^n}{n!}, $$

where the normalization constants for $f(z)$ are $\omega_n = n^{n-1}/n!$ and, for
convenience, a marginally different normalization, $\omega'_n = n^{n-2}(n-1)/n!$, has been
introduced in the case of $t(z)$. Then the recurrence (21) has the form of a



4 Precisely, the model is known as the "random spanning tree model". The derivation of 
our equation (22) closely mimics Section 11 of [43] . 
binomial convolution, so that the cost GF $f(z)$ satisfies

$$ f(z) - \int_0^z f(w) \, \frac{dw}{w} = t(z) + f(z) T(z). $$

By differentiation, this last relation transforms into a linear differential equation
of the first order, itself readily solved by the variation-of-constants method.
Assuming (without loss of generality) the initial condition $t_1 = f_1 = 0$, the
solution found is

$$ f(z) = \frac{T(z)}{1 - T(z)} \int_0^z \partial_w t(w) \, \frac{dw}{T(w)}. \tag{22} $$

In terms of the ordinary generating function of costs, namely,

$$ \tau(z) := \sum_{n \ge 2} t_n z^n, $$

equation (22) can be rephrased as an integral transform involving a Hadamard
product (one has $t(z) = \tau(z) \odot \frac{1}{2} T(z)^2$, since $U = T - T^2/2$), namely,

$$ f(z) = \frac{1}{2} \, \frac{T(z)}{1 - T(z)} \int_0^z \partial_w \left( \tau(w) \odot T(w)^2 \right) \frac{dw}{T(w)}. \tag{23} $$

The dominant singularity at $z = e^{-1}$ of the Cayley tree function $T$ is well
known to be of the square root type. Then the integral transform (23) operates
in a way that combines a Hadamard product and ordinary products, as well
as integration and differentiation. This subject will be resumed in Section 5.3
after general theorems have been established by which one can cope with such
situations. The final conclusions turn out to be qualitatively similar to what
was observed for the Catalan model in (20).



3 Singular expansions, differentiation, and integration

Singularities of generating functions encode very precise information
regarding the asymptotic behaviour of coefficients. In this section, we first recall in
Subsection 3.1 the principles of a process by which this information can be
extracted: this is the singularity analysis framework of [24, 51]. We then prove
that functions amenable to singularity analysis are closed under integration and
differentiation; see Subsection 3.2. These operations have already been seen to
intervene in the analysis of some of the major tree recurrences.



3.1 Basics of singularity analysis 

Singularity analysis deals with functions that have isolated singularities on the
boundary of their disc of convergence and are consequently continuable to wider
areas of the complex plane. The case of a unique dominant singularity suffices
for the applications treated here. (In addition, the case of finitely many
dominant singularities is easily reduced to this situation by using composite contours
and cumulating contributions arising from individual singularities.) Given the
obvious scaling rule,

$$ [z^n] f(z) = \rho^{-n} [z^n] f(\rho z), $$

one may restrict attention, whenever necessary, to the case where the singularity
is at 1. The scaling rule shows that the position of the singularity [at $\rho$ for $f(z)$]
introduces an exponential scaling factor ($\rho^{-n}$) multiplied by the coefficient of a
function singular at 1 [the function $f(\rho z)$].

Definition 1. A function defined by a Taylor series with radius of convergence
equal to 1 is $\Delta$-regular if it can be analytically continued in a domain

$$ \Delta(\phi, \eta) := \{ z : |z| < 1 + \eta, \ |\mathrm{Arg}(z-1)| > \phi \}, $$

for some $\eta > 0$ and $0 < \phi < \pi/2$. A function $f$ is said to admit a singular
expansion at $z = 1$ if it is $\Delta$-regular and

$$ f(z) = \sum_{j=0}^{J} c_j (1-z)^{\alpha_j} + O(|1-z|^{A}) \tag{24} $$

uniformly in $z \in \Delta(\phi, \eta)$, for a sequence of complex numbers $(c_j)_{0 \le j \le J}$ and an
increasing sequence of real numbers $(\alpha_j)_{0 \le j \le J}$ satisfying $\alpha_j < A$. It is said to
satisfy a singular expansion "with logarithmic terms" if, similarly,

$$ f(z) = \sum_{j=0}^{J} c_j(L(z)) \, (1-z)^{\alpha_j} + O(|1-z|^{A}), \qquad L(z) := \log\frac{1}{1-z}, \tag{25} $$

where each $c_j(\cdot)$ is a polynomial.

Note that, by assumption, the $O(\cdot)$ error term in (24) must hold uniformly in
$z \in \Delta(\phi, \eta)$. We also allow in the usual way infinite asymptotic expansions
representing an infinite collection of mutually compatible expansions of type (24).

For the sake of notational simplicity, we shall mostly limit our statements to
the basic case (24) and briefly comment on how they extend to the logarithmic
case (25). The basic theorem is the following:

Theorem 2 (Basic singularity analysis [24]). If $f(z)$ admits a singular
expansion of the form (24) valid in a $\Delta$-domain, then

$$ [z^n] f(z) = \sum_{j=0}^{J} c_j \binom{n - \alpha_j - 1}{-\alpha_j - 1} + O(n^{-A-1}). \tag{26} $$

(The proof of this and similar results is based on an extensive use of Hankel
contours; see the already cited references.) The last expansion can be rephrased
as a standard asymptotic expansion since, for $\alpha \notin \{0, 1, 2, \ldots\}$, one has

$$ \binom{n-\alpha-1}{-\alpha-1} \sim \frac{n^{-\alpha-1}}{\Gamma(-\alpha)} \left( 1 + \frac{\alpha(\alpha+1)}{2n} + \frac{\alpha(\alpha+1)(\alpha+2)(3\alpha+1)}{24 n^2} + \cdots \right), $$

while all the terms corresponding to $\alpha$ a nonnegative integer have an
asymptotically null contribution. When logarithmic terms are present in the singular
expansion, corresponding logarithmic terms arise in the asymptotic expansion
of coefficients. The calculations are conveniently carried out by differentiation
with respect to the parameter $\alpha$:

$$ [z^n] (1-z)^{\alpha} L(z)^r = (-1)^r \frac{\partial^r}{\partial\alpha^r} [z^n] (1-z)^{\alpha} = (-1)^r \frac{\partial^r}{\partial\alpha^r} \binom{n-\alpha-1}{-\alpha-1}, $$

which yields for instance ($\alpha \notin \{0, 1, 2, \ldots\}$)

$$
\begin{aligned}
[z^n] (1-z)^{\alpha} L(z) &= -\frac{\partial}{\partial\alpha} \binom{n-\alpha-1}{-\alpha-1} \\
&= \binom{n-\alpha-1}{-\alpha-1} \left[ \frac{1}{-\alpha} + \frac{1}{1-\alpha} + \cdots + \frac{1}{n-1-\alpha} \right] \\
&= \frac{n^{-\alpha-1}}{\Gamma(-\alpha)} \left( \log n - \psi(-\alpha) + O\left( \frac{\log n}{n} \right) \right).
\end{aligned}
$$

(Here $\psi$ is the logarithmic derivative of $\Gamma$.)
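
The three-term expansion of the binomial coefficient can be compared numerically with an exact value. The sketch below takes $\alpha = -1/2$, for which $[z^n](1-z)^{-1/2} = \binom{2n}{n}/4^n$; the function names are hypothetical and the choice of $\alpha$ is an assumption made for illustration.

```python
from math import comb, gamma


def exact_coefficient(n):
    """[z^n] (1 - z)^(-1/2) = binom(2n, n) / 4^n."""
    return comb(2 * n, n) / 4 ** n


def three_term_estimate(n, alpha):
    """First three terms of the expansion of binom(n - alpha - 1, -alpha - 1)."""
    correction = (
        1
        + alpha * (alpha + 1) / (2 * n)
        + alpha * (alpha + 1) * (alpha + 2) * (3 * alpha + 1) / (24 * n ** 2)
    )
    return n ** (-alpha - 1) / gamma(-alpha) * correction


for n in (10, 100, 1000):
    ratio = exact_coefficient(n) / three_term_estimate(n, -0.5)
    print(n, ratio)   # the ratio approaches 1 as n grows
```

For $\alpha = -1/2$ the correction factor reduces to $1 - 1/(8n) + 1/(128 n^2)$, the familiar refinement of $\binom{2n}{n} 4^{-n} \sim 1/\sqrt{\pi n}$.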

The same proof techniques also make it possible to translate error terms
involving logarithmic terms; see [24] for details. In particular, the following
transfer holds for $A$ and $B$ real numbers:

$$ O\left( (1-z)^{A} L^{B}(z) \right) \quad \longrightarrow \quad O\left( n^{-A-1} \log^{B} n \right). \tag{27} $$

Finally, we shall make use of a result which renders amenable to
singularity analysis generating functions whose coefficients involve powers of $n$ and its
logarithms.

Definition 3. The generalized polylogarithm $\mathrm{Li}_{\alpha,r}$, where $\alpha$ is an arbitrary
complex number and $r$ a nonnegative integer, is defined for $|z| < 1$ by

$$ \mathrm{Li}_{\alpha,r}(z) := \sum_{n \ge 1} (\log n)^r \frac{z^n}{n^{\alpha}}, $$

and the notation $\mathrm{Li}_{\alpha}$ abbreviates $\mathrm{Li}_{\alpha,0}$.

In particular, one has $\mathrm{Li}_{1,0}(z) = \mathrm{Li}_1(z) = L(z)$, the usual logarithm, cf. (25).
The singular expansion of the polylogarithm, taken from [21], involves the
Riemann $\zeta$ function:

Theorem 4 (Singularities of polylogarithms [21]). The function $\mathrm{Li}_{\alpha,r}(z)$
is $\Delta$-continuable and, for $\alpha \notin \{1, 2, \ldots\}$, it satisfies the singular expansion

$$ \mathrm{Li}_{\alpha,0}(z) \sim \Gamma(1-\alpha) w^{\alpha-1} + \sum_{j \ge 0} \frac{(-1)^j}{j!} \zeta(\alpha - j) w^j, \qquad w = -\log z = \sum_{l \ge 1} \frac{(1-z)^l}{l}. \tag{28} $$
For $r > 0$, the singular expansion of $\mathrm{Li}_{\alpha,r}$ is obtained by

$$ \mathrm{Li}_{\alpha,r}(z) = (-1)^r \frac{\partial^r}{\partial\alpha^r} \mathrm{Li}_{\alpha,0}(z), $$

and corresponding termwise differentiation of (28) with respect to $\alpha$.

In particular, for $\alpha < 1$, the main asymptotic term of $\mathrm{Li}_{\alpha,r}$ is

$$ \Gamma(1-\alpha) (1-z)^{\alpha-1} L^r(z). $$

Similar expansions hold when $\alpha$ is a positive integer; see [21] for details.
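
The expansion (28) can be probed numerically near $z = 1$. The sketch below takes $\alpha = 1/2$ and keeps the terms $j = 0, 1$ of the sum. The function names are hypothetical, the two zeta values are quoted numerically as assumed inputs, and the polylogarithm is obtained by plain truncated summation rather than by any library routine.

```python
from math import gamma, log

# Values of the Riemann zeta function, quoted to about 12 digits (assumed inputs).
ZETA_HALF = -1.460354508810          # zeta(1/2)
ZETA_MINUS_HALF = -0.207886224977    # zeta(-1/2)


def polylog_half(z, terms=5000):
    """Li_{1/2}(z) = sum_{n>=1} z^n / sqrt(n), by direct summation (0 < z < 1)."""
    return sum(z ** n / n ** 0.5 for n in range(1, terms + 1))


def singular_estimate(z):
    """Gamma(1/2) w^(-1/2) + zeta(1/2) - zeta(-1/2) w, with w = -log z."""
    w = -log(z)
    return gamma(0.5) * w ** -0.5 + ZETA_HALF - ZETA_MINUS_HALF * w


for z in (0.9, 0.99):
    print(z, polylog_half(z), singular_estimate(z))
```

The discrepancy between the two columns is of order $w^2$, the size of the first neglected term, so it shrinks quickly as $z$ approaches 1.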

Example 5. Stirling's formulæ. The factorial function is attainable via the
form

$$ \log n! = \log 1 + \log 2 + \cdots + \log n = [z^n] \frac{1}{1-z} \mathrm{Li}_{0,1}(z), $$

to which singularity analysis can be applied now that we have taken ordinary
generating functions. Theorem 4 yields the singular expansion

$$ \frac{1}{1-z} \mathrm{Li}_{0,1}(z) \sim \frac{L(z) - \gamma}{(1-z)^2} + \frac{1}{2} \, \frac{-L(z) + \gamma - 1 + \log 2\pi}{1-z} + \cdots, $$

from which Stirling's formula can be read off, by Theorem 2:

$$ \log n! \sim n \log n - n + \frac{1}{2} \log n + \log\sqrt{2\pi} + \cdots. $$

[Stirling's constant $\log\sqrt{2\pi}$ comes out as $-\zeta'(0)$.] Similarly, the "superfactorial
function",

$$ S(n) := 1^1 2^2 \cdots n^n, $$

satisfies

$$ \log S(n) = [z^n] \frac{1}{1-z} \mathrm{Li}_{-1,1}(z), $$

which gives rise to a second-order "Stirling's formula",

$$ S(n) \sim n^{\frac{1}{2} n^2 + \frac{1}{2} n + \frac{1}{12}} \, e^{-\frac{1}{4} n^2} A, $$

with

$$ A := \exp\left( \frac{1}{12} - \zeta'(-1) \right) = \exp\left( -\frac{\zeta'(2)}{2\pi^2} + \frac{\log(2\pi) + \gamma}{12} \right). $$

(This last expansion, originally due to Glaisher, Jeffery, and Kinkelin, goes
back to the 1860s and it can be established by Euler-Maclaurin summation; see
Finch's book [20] for context and references.) The systematic character of the
derivation given here clearly applies to many similar functions. □
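
Both formulæ of Example 5 lend themselves to a quick numerical check. In the sketch below, the function names are hypothetical and the constant $A$ is quoted numerically (an assumed input) rather than computed from $\zeta'(-1)$.

```python
from math import fsum, lgamma, log, pi

GLAISHER = 1.2824271291006226   # constant A of Example 5, quoted numerically


def stirling_estimate(n):
    """n log n - n + (1/2) log n + log sqrt(2 pi)."""
    return n * log(n) - n + 0.5 * log(n) + 0.5 * log(2 * pi)


def log_superfactorial(n):
    """log S(n) = sum_{k<=n} k log k, for S(n) = 1^1 2^2 ... n^n."""
    return fsum(k * log(k) for k in range(1, n + 1))


def second_order_estimate(n):
    """(n^2/2 + n/2 + 1/12) log n - n^2/4 + log A."""
    return (n * n / 2 + n / 2 + 1 / 12) * log(n) - n * n / 4 + log(GLAISHER)


n = 1000
print(lgamma(n + 1) - stirling_estimate(n))            # close to 1/(12 n)
print(log_superfactorial(n) - second_order_estimate(n))  # close to 0
```

Here lgamma(n + 1) from the standard library supplies $\log n!$; the first printed difference reflects the next term $1/(12n)$ of Stirling's series, which the displayed formula leaves in the dots.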
Methods of the last example may be used more generally to determine the
Euler-Maclaurin constant relative to sums of the form $\sum_n (\log n)^r / n^s$. The
derivation by singularity analysis is quite systematic and several formulæ of
Ramanujan can be obtained in this way, for instance,

$$ \lim_{N \to \infty} \left( \sum_{n=1}^{N} \frac{\log^k n}{n} - \frac{\log^{k+1} N}{k+1} \right) = \gamma_k, \quad \text{with} \quad \gamma_k = \frac{(-1)^k}{k+1} \frac{d^{k+1}}{ds^{k+1}} \Big( (s-1) \zeta(s) \Big)_{s=1}, $$

involving the Stieltjes constants $\gamma_k$. See Berndt's account of the problem in [3,
p. 164] and references therein.

3.2 Differentiation and integration

In preparation for our later treatment of Hadamard products, we need a theorem
that enables us to differentiate local expansions of analytic functions around a
singularity. Such a result cannot of course be unconditionally true; see, for
example, (30). However, it turns out that functions amenable to singularity
analysis satisfy this property. The statement that follows is an adaptation suited
to our needs of well-known differentiability properties of complex asymptotic
expansions (see especially Theorem 1.4.2 of Olver's book [52, p. 9]).

Theorem 6 (Singular differentiation). If $f(z)$ is $\Delta$-regular and admits a
singular expansion near its singularity in the sense of (24), then for each integer
$r > 0$, $\frac{d^r}{dz^r} f(z)$ is also $\Delta$-regular and admits an expansion obtained through
term-by-term differentiation:

$$ \frac{d^r}{dz^r} f(z) = (-1)^r \sum_{j=0}^{J} c_j \frac{\Gamma(\alpha_j + 1)}{\Gamma(\alpha_j + 1 - r)} (1-z)^{\alpha_j - r} + O(|1-z|^{A-r}). $$

Proof. Clearly, all that is required is to establish the effect of differentiation on
error terms, which is expressed symbolically as

$$ \frac{d}{dz} O(|1-z|^{A}) = O(|1-z|^{A-1}). $$

By iteration, only the case of a single differentiation ($r = 1$) needs to be
considered.

Let $g(z)$ be a function that is regular in a domain $\Delta(\phi, \eta)$ where it is assumed
to satisfy $g(z) = O(|1-z|^{A})$ for $z \in \Delta$. Choose a subdomain $\Delta' := \Delta(\phi', \eta')$,
where $\phi < \phi' < \pi/2$ and $0 < \eta' < \eta$. By elementary geometry, for any sufficiently
small $\kappa > 0$, the disc of radius $\kappa |z-1|$ centered at a value $z \in \Delta'$ lies entirely
in $\Delta$; see Figure 1. We fix such a small value $\kappa$ and let $\gamma(z)$ represent the
boundary of that disc oriented positively.

The starting point is Cauchy's integral formula

$$ g'(z) = \frac{1}{2\pi i} \int_{C} g(w) \, \frac{dw}{(w-z)^2}, \tag{29} $$

a direct consequence of the residue theorem. Here $C$ should encircle $z$ while
lying inside the domain of regularity of $g$, and we opt for the choice $C = \gamma(z)$.
Then trivial bounds applied to (29) give:

$$ |g'(z)| = O\left( |\gamma(z)| \cdot |1-z|^{A} \, |1-z|^{-2} \right) = O(|1-z|^{A-1}). $$

The estimate involves the length of the contour, $|\gamma(z)|$, which is $O(|1-z|)$ by
construction, as well as the bound on $g$ itself, which is $O(|1-z|^{A})$ since all
points of the contour are themselves at a distance exactly of the order of $|1-z|$
from 1. □

For instance, taking

$$ g(z) = \cos\log\left( \frac{1}{1-z} \right) \quad \text{and} \quad g'(z) = -\frac{1}{1-z} \sin\log\left( \frac{1}{1-z} \right), $$

we correctly predict that $g(z) = O(1) \implies g'(z) = O(|1-z|^{-1})$. On the other
hand, the apparent paradox given by the pair

$$ g(z) = \cos\left( \frac{1}{1-z} \right) \quad \text{and} \quad g'(z) = -\frac{1}{(1-z)^2} \sin\left( \frac{1}{1-z} \right), \tag{30} $$

is resolved by observing that in no nondegenerate sector around $z = 1$ do we
have $g(z) = O(1)$.

It is also well known that integration of asymptotic expansions is usually
easier than differentiation. Here is a statement custom-tailored to our needs.

Theorem 7 (Singular integration). Let $f(z)$ be $\Delta$-regular and admit a
$\Delta$-expansion near its singularity in the sense of (24). Then $\int_0^z f(t) \, dt$ is also
$\Delta$-regular. Assume that none of the quantities $\alpha_j$ and $A$ equals $-1$.



16 



J. A. Fill, P. Flajolet, and N. Kapur 



(i) If A < — 1, then the singular expansion of J f is 

f f{t) dt = - £ -^-r(l ~ + O (|1 - ■ 

Jo j=0 OCj + i 

(ii) If A > —1, i/ien the singular expansion of J f is 

f /(t) a = - £ -^rr(i - *) Q ' +1 + Lo + o (|i - 

where the "integration constant" Lq has the value 



(31) 



L o--= E ^T + / [/(*)- E *a-*r 



dt. 



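Remark. The case where either some $\alpha_j$ or $A$ is $-1$ is easily treated by the additional rules
$$\int_0^z (1-t)^{-1}\,dt = L(z), \qquad \int_0^z O\left(|1-t|^{-1}\right)dt = O(L(z)).$$

Similar rules consistent with elementary integration are applicable for powers of logarithms: they are derived from the easy identities (for $\alpha \neq -1$)
$$\int_0^z (1-t)^{\alpha} L^{r}(t)\,dt = (-1)^{r}\frac{\partial^{r}}{\partial\alpha^{r}}\int_0^z (1-t)^{\alpha}\,dt = (-1)^{r+1}\frac{\partial^{r}}{\partial\alpha^{r}}\,\frac{(1-z)^{\alpha+1}-1}{\alpha+1},$$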

for r a positive integer. Furthermore, the corresponding O-transfers hold true. 
(The proofs are simple modifications of the one given below for the basic case.) 
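
As a small sanity check of Case (ii), the following worked instance may help. The function is chosen purely for illustration and is not taken from the paper; its singular expansion is exact (zero remainder), so any $A > -1$ may be used, and the constant $L_0$ can be compared with direct integration.

```latex
\begin{align*}
f(z) &= (1-z)^{-3/2} + (1-z)^{-1/2},
  \qquad c_0 = c_1 = 1,\quad \alpha_0 = -\tfrac{3}{2},\quad \alpha_1 = -\tfrac{1}{2}, \\
-\sum_{j} \frac{c_j}{\alpha_j+1}\,(1-z)^{\alpha_j+1}
  &= 2(1-z)^{-1/2} - 2(1-z)^{1/2}, \\
L_0 &= \frac{c_0}{\alpha_0+1} + \int_0^1 (1-t)^{-1/2}\,dt = -2 + 2 = 0, \\
\int_0^z f(t)\,dt
  &= \bigl[2(1-t)^{-1/2} - 2(1-t)^{1/2}\bigr]_{0}^{z}
   = 2(1-z)^{-1/2} - 2(1-z)^{1/2}.
\end{align*}
```

The last line is the direct computation; it agrees with the termwise rule plus $L_0 = 0$, as the theorem predicts.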

Proof. The basic technique consists in integrating, term by term, the singular expansion of $f$. We let $r(z)$ be the remainder term in the expansion of $f$, that is,
$$r(z) := f(z) - \sum_{j=0}^{J} c_j (1-z)^{\alpha_j}.$$
By assumption, throughout the $\Delta$-domain one has, for some positive constant $K$,
$$|r(z)| \le K\,|1-z|^{A}.$$

(i) Case $A < -1$. By straight-line integration between 0 and $z$, one finds (31) as soon as it has been established that
$$\int_0^z r(t)\,dt = O\left(|1-z|^{A+1}\right).$$
Figure 2: The contour used in the proof of the integration theorem.



By Cauchy's integral formula, we can choose any path of integration that stays within the region of analyticity of $r$. We choose the contour $\gamma := \gamma_1 \cup \gamma_2$, shown in Figure 2. Then$^{5}$
$$\left|\int_{\gamma} r(t)\,dt\right| \le \left|\int_{\gamma_1} r(t)\,dt\right| + \left|\int_{\gamma_2} r(t)\,dt\right| \le K\int_{\gamma_1} |1-t|^{A}\,|dt| + K\int_{\gamma_2} |1-t|^{A}\,|dt| = O\left(|1-z|^{A+1}\right).$$
Both integrals are $O(|1-z|^{A+1})$: for the integral along $\gamma_1$, this results from explicitly carrying out the integration; for the integral along $\gamma_2$, this results from the trivial bound $O(\|\gamma_2\|\,|1-z|^{A})$.

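(ii) Case $A > -1$. We let $f_-(z)$ represent the "divergence part" of $f$ that gives rise to nonintegrability:
$$f_-(z) := \sum_{\alpha_j < -1} c_j (1-z)^{\alpha_j}.$$
Then with the decomposition $f = [f - f_-] + f_-$, integrations can be performed separately. First, one finds
$$\int_0^z f_-(t)\,dt = -\sum_{\alpha_j < -1} \frac{c_j}{\alpha_j+1}\,(1-z)^{\alpha_j+1} + \sum_{\alpha_j < -1} \frac{c_j}{\alpha_j+1}.$$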

5 The symbol $|dt|$ designates the differential line element (often denoted by $ds$) in the corresponding curvilinear integral.
Next, observe that the asymptotic condition guarantees the existence of $\int_0^1$ applied to $[f - f_-]$, so that
$$\int_0^z [f(t) - f_-(t)]\,dt = \int_0^1 [f(t) - f_-(t)]\,dt + \int_1^z [f(t) - f_-(t)]\,dt.$$
The first of these two integrals is a constant that contributes to $L_0$. As to the second integral, term-by-term integration yields
$$\int_1^z [f(t) - f_-(t)]\,dt = -\sum_{\alpha_j > -1} \frac{c_j}{\alpha_j+1}\,(1-z)^{\alpha_j+1} + \int_1^z r(t)\,dt.$$
The remainder integral is finite, given the growth condition on the remainder term, and, upon carrying out the integration along the rectilinear segment joining 1 to $z$, trivial bounds show that it is indeed $O(|1-z|^{A+1})$. □



4 Hadamard products and transformation of singularities

In this section we propose to examine the way singular expansions get composed under Hadamard products defined at (18). The Hadamard product is a bilinear form. So if we have a set of functions admitting known singular expansions, we need to establish their composition law, and this will give composition rules for finite terminating expansions (Subsection 4.1). In order to extend this to asymptotic expansions with error terms, we need to establish a theorem providing the shape of
$$O\left(|1-z|^{A}\right) \odot O\left(|1-z|^{B}\right).$$
This is the more demanding part of the analysis, which is the subject of Subsection 4.2. Finally, in Subsection 4.3 we provide a summary statement, Theorem 11, to the effect that the class of functions amenable to singularity analysis is closed under Hadamard products and that the composition of singular expansions is effectively computable.

4.1 Composition of singular elements 

The composition rule for polylogarithms is trivial, since
$$\mathrm{Li}_{\alpha,r}(z) \odot \mathrm{Li}_{\beta,s}(z) = \mathrm{Li}_{\alpha+\beta,\,r+s}(z).$$
However, polylogarithms do not have a simple composition rule with respect to ordinary products. We next turn to the composition rule for the basis formed by functions of the form $(1-z)^{a}$, where $a$ may be any real number. From the expansion
$$(1-z)^{a} = 1 + \frac{(-a)}{1}\,z + \frac{(-a)(-a+1)}{2!}\,z^{2} + \cdots \tag{32}$$
around the origin, we get through term-by-term multiplication
$$(1-z)^{a} \odot (1-z)^{b} = {}_2F_1[-a, -b;\, 1;\, z]. \tag{33}$$
Here ${}_2F_1$ represents the classical hypergeometric function of Gauss defined by
$${}_2F_1[\alpha, \beta;\, \gamma;\, z] = 1 + \frac{\alpha\beta}{\gamma}\,\frac{z}{1!} + \frac{\alpha(\alpha+1)\,\beta(\beta+1)}{\gamma(\gamma+1)}\,\frac{z^{2}}{2!} + \cdots. \tag{34}$$
From the transformation theory of hypergeometrics, see e.g. [34, p. 163], we know that, in general, hypergeometric functions can be expanded in the vicinity of $z = 1$ by means of the $z \mapsto 1 - z$ transformation. Instantiation of this transformation with $\gamma = 1$ yields
$${}_2F_1[\alpha, \beta;\, 1;\, z] = \frac{\Gamma(1-\alpha-\beta)}{\Gamma(1-\alpha)\,\Gamma(1-\beta)}\,{}_2F_1[\alpha, \beta;\, \alpha+\beta;\, 1-z] + \frac{\Gamma(\alpha+\beta-1)}{\Gamma(\alpha)\,\Gamma(\beta)}\,(1-z)^{-\alpha-\beta+1}\,{}_2F_1[1-\alpha, 1-\beta;\, 2-\alpha-\beta;\, 1-z]. \tag{35}$$
In other words, we can state the following proposition:
In other words, we can state the following proposition: 

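Proposition 8. When $a$, $b$, and $a + b$ are not integers, the Hadamard product $(1-z)^{a} \odot (1-z)^{b}$ has an infinite $\Delta$-expansion with exponent scale $\{0, 1, 2, \ldots\} \cup \{a+b+1, a+b+2, \ldots\}$, namely
$$(1-z)^{a} \odot (1-z)^{b} \sim \sum_{k \ge 0} \lambda_k^{(a,b)}\,\frac{(1-z)^{k}}{k!} + \sum_{k \ge 0} \mu_k^{(a,b)}\,\frac{(1-z)^{a+b+1+k}}{k!},$$
where the coefficients $\lambda$ and $\mu$ are given by
$$\lambda_k^{(a,b)} = \frac{\Gamma(1+a+b)}{\Gamma(1+a)\,\Gamma(1+b)}\,\frac{(-a)^{\overline{k}}\,(-b)^{\overline{k}}}{(-a-b)^{\overline{k}}}, \qquad \mu_k^{(a,b)} = \frac{\Gamma(-a-b-1)}{\Gamma(-a)\,\Gamma(-b)}\,\frac{(1+a)^{\overline{k}}\,(1+b)^{\overline{k}}}{(2+a+b)^{\overline{k}}}.$$
Here $x^{\overline{k}}$ is defined when $k$ is a nonnegative integer as $x(x+1)\cdots(x+k-1)$.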
Remark. The case where either $a$ or $b$ is an integer poses no difficulty: one has
- $(1-z)^{a} \odot g(z)$ is a polynomial if $a = m$, where $m \in \mathbb{Z}_{\ge 0}$;
- $(1-z)^{a} \odot g(z)$ is a derivative if $a = -m$ where $m \in \mathbb{Z}_{>0}$, since
$$(1-z)^{-m} \odot g(z) = \frac{1}{(m-1)!}\,\partial_z^{m-1}\left(z^{m-1} g(z)\right),$$
and this case is covered by singular differentiation, Theorem 6.

Notice that Proposition 8 remains valid in these two cases with the natural convention that $1/\Gamma(-j) = 0$ when $j \in \mathbb{Z}_{\ge 0}$.

The case where $a + b \in \mathbb{Z}$ needs transformation formulae that extend (35) and are found explicitly in the books by Abramowitz and Stegun [1, pp. 559-560] and by Whittaker and Watson [63, §14.53].
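
The composition rule of Proposition 8 can be checked numerically. The sketch below is only an illustration: the function names, the truncation orders, and the sample values $a = -0.3$, $b = -0.4$, $z = 0.9$ are assumptions of this example, not part of the paper. It compares the Hadamard product summed directly from coefficients with the two hypergeometric sums in powers of $1 - z$, which converge for $|1 - z| < 1$.

```python
import math

def rising(x, k):
    """Rising factorial x(x+1)...(x+k-1); the empty product is 1."""
    out = 1.0
    for i in range(k):
        out *= x + i
    return out

def hadamard_series(a, b, z, terms=2000):
    """Sum over n of [z^n](1-z)^a * [z^n](1-z)^b * z^n, for |z| < 1."""
    total, term = 0.0, 1.0  # term = (-a)_n (-b)_n / (n!)^2 * z^n
    for n in range(terms):
        total += term
        term *= (n - a) * (n - b) / ((n + 1) ** 2) * z
    return total

def singular_expansion(a, b, z, order=40):
    """Both sums of Proposition 8, truncated after `order` terms each."""
    lam0 = math.gamma(1 + a + b) / (math.gamma(1 + a) * math.gamma(1 + b))
    mu0 = math.gamma(-a - b - 1) / (math.gamma(-a) * math.gamma(-b))
    u = 1.0 - z
    total = 0.0
    for k in range(order):
        lam = lam0 * rising(-a, k) * rising(-b, k) / rising(-a - b, k)
        mu = mu0 * rising(1 + a, k) * rising(1 + b, k) / rising(2 + a + b, k)
        total += (lam * u ** k + mu * u ** (a + b + 1 + k)) / math.factorial(k)
    return total
```

Calling `hadamard_series(-0.3, -0.4, 0.9)` and `singular_expansion(-0.3, -0.4, 0.9)` should return the same value up to rounding, since for non-integer $a$, $b$, $a+b$ the expansion is an identity, not merely an asymptotic statement.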

Remark. The case of expansions with logarithmic terms is covered by "differentiation under the integral sign", as we now explain. Consider for instance the Hadamard product
$$\left[(1-z)^{-\alpha} L(z)\right] \odot (1-z)^{-\beta} = \frac{\partial}{\partial\alpha}\,{}_2F_1[\alpha, \beta;\, 1;\, z],$$
where we assume for convenience that none of $\alpha$, $\beta$, $\alpha + \beta$ is an integer. For any fixed $\beta$ and any fixed $z$, with, say, $z \in (0, 1)$, both sides of (35) represent analytic functions of $\alpha$. Thus, their derivatives with respect to $\alpha$ are identical as functions of $\alpha$. This induces a transformation formula, originally valid in the stated $z$-range, which involves modified hypergeometric functions (these have additional $\psi$-factors in their coefficients) obtained from the fundamental ${}_2F_1$ function by differentiation with respect to some of the parameters. The modified functions then do exist in extended regions of the complex $z$-plane as shown by taking the classical Barnes representations in terms of contour integrals (see, e.g., [63, §14.5]) and then differentiating under the integral sign. The net effect of this discussion is that the fundamental transformation (35) supports differentiation with respect to $\alpha$, $\beta$ and that the formally derived transformations provide analytically valid composition formulas for Hadamard products
$$\left[(1-z)^{-\alpha} L^{k}(z)\right] \odot \left[(1-z)^{-\beta} L^{\ell}(z)\right] \tag{36}$$
of the base functions.

In practice, for all the cases described above, one may often proceed as follows: (i) take advantage of the a priori existence of a singular expansion of $f \odot g$, with $f(z) = (1-z)^{a}$, $g(z) = (1-z)^{b}$ or some of their derivatives, that is valid for $z$ in a $\Delta$-region (here the slit complex plane); (ii) compute an asymptotic expansion of the coefficients of $f \odot g$ by multiplication of the asymptotic expansions of $f_n$ and $g_n$ as obtained via singularity analysis; (iii) reconstruct a singular function that matches asymptotically $f_n g_n$ by using singularity analysis in the reverse direction. In Subsection 4.3 this process is formalized by the "Zigzag Algorithm" and illustrated by the return of Polya's drunkard.

Globally, we are facing a situation where polylogarithms are simple for Hadamard products and relatively complicated for ordinary products, with the dual situation occurring in the case of power functions. Each particular situation is likely to dictate whether calculations are best expressed in a basis of standard singular functions like $\{(1-z)^{\alpha} L(z)^{k}\}$ or with polylogarithms, $\{\mathrm{Li}_{\alpha,k}(z)\}$.
4.2 Composition of error terms

We now examine how $O(\cdot)$ terms get composed under Hadamard products. The task is easier when the resulting function gets large at its singularity as shown by Proposition 9. Fortunately, thanks to the results of Section 3 regarding differentiation and integration, all cases can be reduced to this one: see Proposition 10 below.

The starting point is a general integral formula due to Hadamard for $(f \odot g)(z)$, where
$$f(z) = \sum_{n \ge 0} f_n z^{n} \quad\text{and}\quad g(z) = \sum_{n \ge 0} g_n z^{n}.$$
Assume that $f$ and $g$ are analytic in the unit disc and let $z$ be a complex number satisfying $|z| < 1$. Consider the integral
$$I = \frac{1}{2\pi i}\int_{\gamma_0} f(w)\,g\left(\frac{z}{w}\right)\frac{dw}{w}, \tag{37}$$
taken (counterclockwise) along a contour $\gamma_0$ which is simply a circle of radius $\rho$ centered at the origin such that $|z| < \rho < 1$. In this way, both factors in the integrand are analytic functions of $w$ along the contour. Evaluating the integral (37) by expanding the functions, we find
$$I = \sum_{n \ge 0} f_n g_n z^{n}.$$
This is the classical formula of Hadamard for Hadamard products,
$$(f \odot g)(z) = \frac{1}{2\pi i}\int_{C} f(w)\,g\left(\frac{z}{w}\right)\frac{dw}{w}, \tag{38}$$
valid, by analyticity, for any simple contour $C$ such that each $w \in C$ satisfies $|z| < |w| < 1$.
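
Hadamard's formula lends itself to a direct numerical check. The following sketch is an illustration under stated assumptions: the test functions $f = g = (1-w)^{-1/2}$, the point $z = 0.3$, the radius $\rho = 0.6$, and the trapezoidal rule with 512 nodes are all choices made for this example only. On a circle, $dw/w = i\,d\theta$, so the integral reduces to an average of $f(w)\,g(z/w)$ over equally spaced nodes.

```python
import cmath
import math

def hadamard_integral(f, g, z, rho, nodes=512):
    """Trapezoidal rule for (1/(2*pi*i)) * integral of f(w) g(z/w) dw/w on |w| = rho."""
    total = 0j
    for j in range(nodes):
        w = rho * cmath.exp(2j * math.pi * j / nodes)
        total += f(w) * g(z / w)  # dw/w = i dtheta, so the factor i cancels
    return total / nodes

def inv_sqrt(w):
    """(1 - w)^(-1/2), principal branch, analytic in the unit disc."""
    return (1 - w) ** -0.5

def direct_series(z, terms=400):
    """Sum of (binom(2n, n) / 4^n)^2 z^n, the termwise product of coefficients."""
    total, r = 0.0, 1.0  # r = binom(2n, n) / 4^n
    for n in range(terms):
        total += r * r * z ** n
        r *= (2 * n + 1) / (2 * n + 2)
    return total
```

For instance, `hadamard_integral(inv_sqrt, inv_sqrt, 0.3, 0.6)` and `direct_series(0.3)` should agree to near machine precision, because the trapezoidal rule converges geometrically for integrands analytic in an annulus around the contour.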

Proposition 9. Assume that $f(z)$ and $g(z)$ are $\Delta$-regular in $\Delta(\psi_0, \eta)$ and that
$$f(z) = O(|1-z|^{a}) \quad\text{and}\quad g(z) = O(|1-z|^{b}), \qquad z \in \Delta(\psi_0, \eta),$$
where $a$ and $b$ satisfy $a + b + 1 < 0$. Then the Hadamard product $(f \odot g)(z)$ is regular in a (possibly smaller) $\Delta$-domain, call it $\Delta'$, where it admits the expansion
$$(f \odot g)(z) = O\left(|1-z|^{a+b+1}\right). \tag{39}$$

Proof. We first observe$^{6}$ that $f \odot g$ is continuable to certain points $z$ such that $|z| > 1$. (Precisely, as shown below, it admits a continuation in a $\Delta$-domain.)

6 This part of the argument is an adaptation to our needs of a famous result first due to Hadamard regarding the continuation of Hadamard products; see for instance the description in [4, Vol. II, p. 300] or [14, Sec. 88]. Accordingly, we limit ourselves to a succinct discussion only meant to set the stage for the precise estimates starting at (40).
Figure 3: The geometry of Hadamard domains: (left) boundary of a $\Delta$-domain ($1 + \eta = 1.5$); (middle) boundary of $\Delta^{-1}$; (right) an allowable domain in $\Delta \cap z\Delta^{-1}$ for application of Hadamard's formula is the unshaded subset of $\Delta$ ($|z| = 1.25$).

Indeed, because of the analytic continuation properties of $f$ and $g$, both $f(w)$ and $g(z/w)$ are analytic functions of $w$ in the domain $\Delta \cap (z\Delta^{-1})$, where $\Delta^{-1}$ denotes $\{w^{-1} : w \in \Delta\}$; see Figure 3 for a rendering. In other words, the allowed domain of values of $w$ is $\Delta$ stripped of the internal domain $(z\Delta^{-1})^{c}$, where $(\cdot)^{c}$ represents complementation. Fix then some $z_1$ outside the unit disc but within $\Delta$, and choose a simple contour $\gamma_1$ inside both $\Delta$ and $z_1\Delta^{-1}$. Let $I(z)$ be the integral of (37) and (38) taken along this fixed contour $\gamma_1$. (The feasibility of finding a suitable $\gamma_1$ is suggested by Figure 3, at least when $|z_1|$ remains close enough to 1 and $z_1$ is to the left of 1; a particular contour adapted to the case where $z_1$ is close to 1 and possibly to its right will be constructed explicitly in the proof below.) Now, when $z$ moves radially along the segment $(0, z_1)$, the quantity $I(z)$ defines an analytic function of $z$ that does coincide with $(f \odot g)(z)$ as soon as $|z| < 1$ [this results from the "standard" formula (38)]. Thus analytic continuation of $f \odot g$, from within the unit disc to some $z_1$ lying outside of the unit disc, is granted. The argument shows at the same time that Hadamard's formula (38) remains a valid representation of $f \odot g$ along such a contour $\gamma_1$ or any of its deformations legally granted by analyticity.

We next turn to estimating the growth at its singularity of $h := f \odot g$. It suffices to prove the estimate (39) on $h$ for $z$ belonging to a restricted domain $\Delta' := \Delta(\psi_1, \eta_1)$, where we shall take
$$\eta_1 = c_1\eta, \qquad \left(\frac{\pi}{2} - \psi_1\right) = c_1\left(\frac{\pi}{2} - \psi_0\right), \tag{40}$$
for some small positive constant $c_1$. Notice also that it suffices to establish the estimate of (39) for
$$|z - 1| < \eta_1 = c_1\eta \tag{41}$$
with $z \in \Delta'$, since $h$, being analytic in the rest of $\Delta'$, is certainly bounded there.
The main geometric objects from which the contour is built are as follows. First consider the circle centered at the origin
$$C_0 := \{w : |w| = R\}, \qquad R := 1 - c_2\eta, \tag{42}$$
for some small constant $c_2$ (independent of $z$). Set $\delta = |z - 1|$, which is the main parameter governing the scaling of the contour $\gamma$. We also consider the circle
$$C_z := \{w : |w - z| = c_3\delta\}, \tag{43}$$
for some small positive $c_3$. Finally, the contour $\gamma$ includes parts of the two tangents $T$, $T'$ to the circle $C_0$ issuing from $z$; see Figure 4. The contour is then precisely specified as
$$\gamma = \gamma_0 \cup \gamma_T \cup \gamma_z \cup \gamma_{T'},$$
where $\gamma_T$ is the segment of $T$ formed of points in between $C_0$ and $z$ that are exterior to $C_0$ and $C_z$, and similarly for $\gamma_{T'}$. The component $\gamma_0$ is the part of the circle $C_0$ that lies on the "southwest" of $z$ and joins with $T$, $T'$; the component $\gamma_z$ is the part of the circle $C_z$ that lies on the "northeast" of $z$ and joins with $T$, $T'$. The constants $c_1$, $c_2$, $c_3$ are to be specified later and they can be taken as small as needed.

The fundamental constraint to be satisfied is that $\gamma$ should lie entirely within $\Delta \cap (z\Delta^{-1})$ when $z$ stays within $\Delta'$: for $w \in \gamma$, this ensures simultaneously $w \in \Delta$ and $z/w \in \Delta$, hence the validity of the Hadamard integral (38). By a priori choosing $c_1$ (which limits $z$) and $c_3$ (which controls the radius of $C_z$) both small enough, the condition $\gamma \subset \Delta$ is granted by elementary geometry. (E.g., the circle $C_z$ will not extend too much to the right of $\Re(w) = 1$ and will therefore be "compatible" with the indentation of $\Delta$ at 1.) Next, one should have $\gamma \cap (z\Delta^{-1})^{c} = \emptyset$. This requires in particular choosing the radius $R$ in (42) larger than $|z|(1+\eta)^{-1}$, which is at most $(1 + c_1\eta)/(1+\eta)$ since $z$ has been restricted to $|z - 1| < c_1\eta$ by (41). This geometric condition, expressed as
$$1 - c_2\eta > \frac{1 + c_1\eta}{1 + \eta}, \tag{44}$$
Figure 5: Apex avoidance condition: The angle at $z$ of contour $\gamma$ is constructed to be larger than the angle at $z$ of the apex of $(z\Delta^{-1})^{c}$.



is granted as soon as $c_1$, $c_2$ are both taken small enough. [E.g., it suffices that both $c_1$, $c_2$ be less than $\frac{1}{2}(1+\eta)^{-1}$.] We henceforth assume these smallness conditions on $c_1$, $c_2$, $c_3$ to be satisfied. Finally, the contour should avoid the apex$^{7}$ of the domain $(z\Delta^{-1})^{c}$. Define the "viewing angle" of a point $P$ exterior to a circle $C$ as the angle between the two tangents to $C$ issuing from $P$. For a circle of radius $r$ and a point at distance $d$ from the center, this angle is $2\arcsin(r/d)$. In particular the point $z$ itself views the circle $C_0$ of radius $R$ under the angle $2\arcsin(R/|z|)$, and this viewing angle is bounded from below by
$$2\arcsin\left(\frac{R}{1 + c_1\eta}\right) = 2\arcsin\left(\frac{1 - c_2\eta}{1 + c_1\eta}\right),$$
since the farthest $z$ can get from the origin is by assumption $1 + c_1\eta$. It then suffices to choose $c_1$, $c_2$ so that
$$2\arcsin\left(\frac{1 - c_2\eta}{1 + c_1\eta}\right) > 2\psi_0 \tag{45}$$
(e.g., decide $c_2 = c_1$, then decrease $c_1 = c_2$ until the inequality in (45) is satisfied) in order to ensure that the angle under which $z$ views the circle $C_0$ exceeds $2\psi_0$. Since $C_0$ encloses the inner disc of $(z\Delta^{-1})^{c}$ with which it is concentric, and since the angle at $z$ of the apex of $(z\Delta^{-1})^{c}$ is $2\psi_0$, there results that the angle at $z$ between $\gamma_T$ and $\gamma_{T'}$ encompasses the apex of $(z\Delta^{-1})^{c}$; see Figure 5. In this way, the apex of $(z\Delta^{-1})^{c}$ is avoided.
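
The recipe "decide $c_2 = c_1$, then decrease until the inequality holds" can be written out as a short routine. This is a sketch under assumptions: the function name is hypothetical, the radius condition is taken in the form $1 - c_2\eta > (1 + c_1\eta)/(1+\eta)$ (the reading of (44) implied by the requirement that $R$ exceed $(1+c_1\eta)/(1+\eta)$), and halving is just one convenient way of decreasing the constant.

```python
import math

def choose_constants(eta, psi0):
    """Return c, to be used as c1 = c2 = c, satisfying the radius and angle conditions.

    Assumes eta > 0 and 0 < psi0 < pi/2.
    """
    c = 0.5 / (1.0 + eta)  # starting value suggested by the smallness remark
    while True:
        # Radius condition, assumed form of (44).
        radius_ok = 1.0 - c * eta > (1.0 + c * eta) / (1.0 + eta)
        # Viewing-angle condition (45).
        ratio = (1.0 - c * eta) / (1.0 + c * eta)
        angle_ok = 2.0 * math.asin(ratio) > 2.0 * psi0
        if radius_ok and angle_ok:
            return c
        c /= 2.0  # decrease c1 = c2 until both inequalities hold
```

The loop terminates because, as $c \to 0$, the radius condition tends to $1 > 1/(1+\eta)$ and the viewing angle tends to $\pi > 2\psi_0$.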

Last, for $\lambda$ any of the four contours of which $\gamma$ is comprised, let $I(\lambda)$ be the integral of (37) taken along contour $\lambda$. The circular arc $\gamma_z$ has all its points at a distance $c_3\delta$ from $z$, so that there
$$|1 - w| = \Theta(\delta), \qquad |z - w| = \Theta(\delta), \qquad f(w)\,g\left(\frac{z}{w}\right) = O\left(\delta^{a+b}\right).$$
Therefore, by trivial bounds,
$$I(\gamma_z) = O\left(\delta^{a+b+1}\right). \tag{46}$$

7 By the "apex" of $(z\Delta^{-1})^{c}$, we mean the complement in $(z\Delta^{-1})^{c}$ of the largest circular disc centered at the origin which is contained in $(z\Delta^{-1})^{c}$.
On the other hand, along $\gamma_0$ the functions $f(w)$ and $g(z/w)$ stay away from their singularities, so that
$$I(\gamma_0) = O(1). \tag{47}$$
There remains only to estimate the contribution along the two connecting segments $\gamma_T$ and $\gamma_{T'}$. The two situations are similar (upon interchanging the roles of $a$ and $b$). It is then easily seen that the contribution along the ray stemming from $z$ is bounded from above by a multiple of an integral of the form
$$\int_{c_3\delta}^{+\infty} t^{a}\,|t - z_0|^{b}\,dt, \tag{48}$$
where $z_0$ is a complex number at a distance $O(\delta)$ from the real line. (The quantity $t$ parameterizes the tangent line $T$ or $T'$.) The last integral is $O(\delta^{a+b+1})$, as results from the change of variables $t \mapsto \delta t$. Consequently, one finds
$$I(\gamma_T) + I(\gamma_{T'}) = O\left(\delta^{a+b+1}\right). \tag{49}$$
Putting together all the estimates of (46), (47), (49) yields the desired result. □

Remark. The proof technique of Proposition 9 tolerates the presence of logarithmic factors, in which case it suffices to develop the corresponding estimates for the basic integral (48). We find in this way, when $a + b + 1 < 0$, the estimate
$$O\left(|1-z|^{a}\,|L^{k}(z)|\right) \odot O\left(|1-z|^{b}\,|L^{\ell}(z)|\right) = O\left(|1-z|^{a+b+1}\,|L^{k+\ell}(z)|\right).$$
The contour $\gamma$ used in the proof is also susceptible to many variations. For instance, one may deform it slightly to include a "hook" near $w = 1$, in which case the modified contour may be used to estimate more finely the singular behaviour of Hadamard products.

We can then extend the asymptotic range covered by Proposition 9 as follows.

Proposition 10. Assume that $f(z)$ and $g(z)$ are $\Delta$-regular and that for $z \in \Delta$,
$$f(z) = O(|1-z|^{a}) \quad\text{and}\quad g(z) = O(|1-z|^{b}).$$
(i) If $k < a + b + 1 < k + 1$ for some integer $-1 \le k < \infty$, then for $z \in \Delta'$:
$$(f \odot g)(z) = \sum_{j=0}^{k} \frac{(-1)^{j}}{j!}\,(f \odot g)^{(j)}(1)\,(1-z)^{j} + O\left(|1-z|^{a+b+1}\right).$$
(ii) If $a + b + 1$ is a nonnegative integer then for $z \in \Delta'$:
$$(f \odot g)(z) = \sum_{j=0}^{a+b} \frac{(-1)^{j}}{j!}\,(f \odot g)^{(j)}(1)\,(1-z)^{j} + O\left(|1-z|^{a+b+1}\,|L(z)|\right).$$
Proof. Let $\partial = \partial_z$ denote the operator $\frac{d}{dz}$ and let $\vartheta$ denote the Euler operator $z\partial$. Observe that
$$\vartheta(f \odot g) = (\vartheta f) \odot g,$$
which yields
$$\vartheta^{k+1}(f \odot g) = (\vartheta^{k+1} f) \odot g.$$
The differentiation properties of Theorem 6 imply [with $k := a + b + 1$ in Case (ii)] that $\vartheta^{k+1} f(z)$ is $O(|1-z|^{a-k-1})$. Thus, Proposition 9 applies, to the effect that
$$\left(\vartheta^{k+1}(f \odot g)\right)(z) = O\left(|1-z|^{a+b-k}\right).$$
On the other hand, the operator $\vartheta^{-1}$ is (for $h$ in the image of $\vartheta$)
$$(\vartheta^{-1} h)(z) := P_0 + \int_0^z h(t)\,\frac{dt}{t},$$
for some integration constant $P_0$. It is then possible to recover $h = f \odot g$ through successive integrations, by making use of Theorem 7.

Case (i). By definition of $k$, one has $-1 < a + b - k < 0$. Repeated integrations then show that
$$(f \odot g)(z) = P(z) + O\left(|1-z|^{a+b+1}\right), \tag{50}$$
for some polynomial $P(z)$ of degree $k$ that encapsulates the sequence of integration constants. Equation (50) yields qualitatively the form of the statement. The polynomial $P(z)$ is then automatically determined as the first $(k+1)$ terms of the Taylor expansion of $f \odot g$ at 1, which is precisely what our assertion expresses.

Case (ii). In this case, the first integration step requires integrating a term $O(|1-z|^{-1})$, which leads to the logarithmic form of the statement. (See also the comments following Theorem 7.) □

4.3 Composition rules 

At this stage, we can summarize the state of affairs regarding Hadamard products by the following general statement.

Theorem 11 (Hadamard composition of singularities). Let $f(z)$ and $g(z)$ be two functions that are $\Delta$-regular with expansions of the type (24):
$$f(z) = \sum_{m=0}^{M} c_m (1-z)^{\alpha_m} + O\left(|1-z|^{A}\right), \qquad g(z) = \sum_{n=0}^{N} d_n (1-z)^{\beta_n} + O\left(|1-z|^{B}\right).$$
Then, the Hadamard product $(f \odot g)(z)$ is also $\Delta$-regular. Its singular expansion is computable by bilinearity, using the composition rules of Proposition 8 and the remarks thereafter, with error terms provided by Propositions 9 and 10:
$$(f \odot g)(z) = \sum_{m,n} c_m d_n \left[(1-z)^{\alpha_m} \odot (1-z)^{\beta_n}\right] + P(1-z) + O\left(|1-z|^{C}\right),$$
where $C := 1 + \min(\alpha_0 + B,\ A + \beta_0)$ and $P$ is a polynomial of degree less than $C$.
The polynomial $P$ is accessible via the Taylor expansion of $h - h_{\mathrm{sing}}$, where $h_{\mathrm{sing}}$ represents the sum of all the elements in the asymptotic expansion of $h := f \odot g$ at $z = 1$ that are singular. This theorem then validates the following algorithm, which is often helpful in computations done by hand when composing functions under Hadamard products.

"Zigzag" Algorithm. [Computes the singular expansion of $f \odot g$ up to $O(|1-z|^{C})$.]

1. Use singularity analysis to determine separately the asymptotic expansions $\mathrm{Asympt}(f_n)$, $\mathrm{Asympt}(g_n)$ of $f_n = [z^{n}]f(z)$ and $g_n = [z^{n}]g(z)$ into descending powers of $n$.

2. Perform the resulting product and reorganize as $\mathrm{Asympt}(f_n g_n)$.

3. Choose a basis $\mathcal{B}$ of singular functions, for instance, the standard basis $\mathcal{B} = \{(1-z)^{\alpha} L(z)^{k}\}$, or the polylogarithm basis, $\mathcal{B} = \{\mathrm{Li}_{\beta,k}(z)\}$. Construct a function $H(z)$ expressed in terms of $\mathcal{B}$ whose singular behaviour is such that the asymptotic form of its coefficients, $\mathrm{Asympt}(H_n)$, is compatible with $\mathrm{Asympt}(f_n g_n)$ up to the needed error terms.

4. Output the singular expansion of $f \odot g$ as the quantity $H(z) + P(z) + O(|1-z|^{C})$, where $P$ is a polynomial in $(1-z)$ of degree less than $C$.

The reason for the addition of a polynomial in Step 4 is that integral powers of $(1-z)$ do not leave a trace in coefficient asymptotics since their contribution is asymptotically null. (An example of such "hidden" analytic terms already appears in the composition rule for powers given in Proposition 8.) The Zigzag Algorithm is then principally useful for determining the divergent part of expansions. If needed, the coefficients in the polynomial $P$ can be expressed as values of the function $f \odot g$ and its derivatives at 1 once it has been stripped of its nondifferentiable terms. (This is analogous to the situation prevailing in Proposition 10.)

Example 12. The return of Polya's drunkard. In the $d$-dimensional lattice $\mathbb{Z}^{d}$ of points with integer coordinates, the drunkard performs a random walk starting from the origin with steps in $\{-1, +1\}^{d}$, each taken with equal likelihood. The probability that the drunkard is back at the origin after $2n$ steps is
$$q_n^{(d)} = \left(\frac{1}{2^{2n}}\binom{2n}{n}\right)^{d}, \tag{51}$$
since the walk is a product of $d$ independent 1-dimensional walks. The probability that $2n$ is the epoch of the first return to the origin is the quantity $p_n^{(d)}$, which is determined implicitly by
$$\left(1 - \sum_{n \ge 1} p_n^{(d)} z^{n}\right)^{-1} = \sum_{n \ge 0} q_n^{(d)} z^{n}, \tag{52}$$
as results from the convolution equations expressing the decomposition of loops into primitive loops. In terms of the associated ordinary generating functions $P$ and $Q$, this relation thus reads as $(1 - P(z))^{-1} = Q(z)$.
The asymptotic analysis of the $q_n$'s is straightforward; the one of the $p_n$'s is
more involved and is of interest in connection with recurrence and transience of 
the random walk; see, e.g., pT5i S4] . The Hadamard closure theorem provides a 
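direct access to this problem. Define
$$\lambda(z) := \sum_{n \ge 0} \frac{1}{2^{2n}}\binom{2n}{n} z^{n} = \frac{1}{\sqrt{1-z}}.$$
Then, Equations (51) and (52) imply:
$$P(z) = 1 - \frac{1}{\lambda(z)^{\odot d}}, \quad\text{where}\quad \lambda(z)^{\odot d} := \lambda(z) \odot \cdots \odot \lambda(z) \ \ (d \text{ times}).$$
The singularities of $P(z)$ are found to be as follows.

$d = 1$: No Hadamard product is involved and
$$P(z) = 1 - \sqrt{1-z}, \quad\text{implying}\quad p_n^{(1)} = \frac{1}{n\,2^{2n-1}}\binom{2n-2}{n-1}.$$
(This agrees with the classical combinatorial solution expressed in terms of Catalan numbers.)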
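
The relation $(1 - P(z))^{-1} = Q(z)$ can be unwound coefficient by coefficient, which gives a quick exact check of the case $d = 1$. The sketch below is an illustration only; the function names are hypothetical, and exact rational arithmetic is used so that the comparison with the closed form involves no rounding.

```python
from fractions import Fraction
from math import comb

def return_probabilities(d, n_max):
    """q[n]: probability of being back at the origin after 2n steps."""
    return [Fraction(comb(2 * n, n), 4 ** n) ** d for n in range(n_max + 1)]

def first_return_probabilities(q):
    """Solve (1 - P(z)) Q(z) = 1 coefficient by coefficient for p[1], p[2], ..."""
    p = [Fraction(0)] * len(q)
    for n in range(1, len(q)):
        # q_n = sum_{k=1}^{n} p_k q_{n-k} with q_0 = 1.
        p[n] = q[n] - sum(p[k] * q[n - k] for k in range(1, n))
    return p
```

With `q = return_probabilities(1, 12)`, the values `first_return_probabilities(q)[n]` coincide with $\frac{1}{n\,2^{2n-1}}\binom{2n-2}{n-1}$; the same routine with `d = 2` or `d = 3` produces the exact first-return probabilities whose asymptotics are discussed next.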

$d = 2$: By the Hadamard closure theorem, the function $Q(z) = \lambda(z) \odot \lambda(z)$ admits a priori a singular expansion at $z = 1$ that is composed solely of elements of the form $(1-z)^{\alpha}$ possibly multiplied by integral powers of the logarithmic function $L(z)$. From a computational standpoint (cf. the Zigzag Algorithm), it is then best to start from the coefficients themselves,
$$q_n^{(2)} = \left(\frac{1}{2^{2n}}\binom{2n}{n}\right)^{2} \sim \frac{1}{\pi}\left(\frac{1}{n} - \frac{1}{4n^{2}} + \frac{1}{32n^{3}} + \cdots\right),$$
and reconstruct the only singular expansion that is compatible, namely
$$Q(z) = \frac{1}{\pi}\,L(z) + K + O\left((1-z)^{1-\epsilon}\right),$$
where $\epsilon > 0$ is an arbitrarily small constant and $K$ is fully determined as the limit as $z \to 1$ of $Q(z) - \pi^{-1}L(z)$. Then it can be seen that the function $P$ is $\Delta$-continuable. (Proof: Otherwise, there would be complex poles arising from zeros of the function $Q$ on the unit disc, and this would entail in $p_n^{(2)}$ the presence of terms oscillating around 0, a fact that contradicts the necessary positivity of probabilities.) The singular expansion of $P(z)$ at $z = 1$ results immediately from that of $Q(z)$:
$$P(z) \sim 1 - \frac{\pi}{L(z)} + \frac{\pi^{2}K}{L^{2}(z)} + \cdots,$$

so that, by the extension of Theorem \2\ to arbitrary powers of logarithms as 
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given in [241 IS] , one has 



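$$p_n^{(2)} = \frac{\pi}{n\log^{2} n} - 2\pi\,\frac{\gamma + \pi K}{n\log^{3} n} + O\left(\frac{1}{n\log^{4} n}\right),$$
$$K = 1 + \sum_{n=1}^{\infty}\left(16^{-n}\binom{2n}{n}^{2} - \frac{1}{\pi n}\right) = 0.8825424006106063735858257.$$
(See the study by Louchard et al. [46, Sec. 4] for somewhat similar calculations.)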
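
The constant $K$ is easy to approximate from its defining series. The sketch below is a plain partial sum, given only as an illustration (the function name and the cutoff are assumptions of this example). Since the summand behaves like $-1/(4\pi n^{2})$, stopping at $N$ terms leaves an error of order $1/(4\pi N)$, so this is a rough check and not a way to reproduce all the digits quoted above.

```python
import math

def drunkard_constant(terms=200_000):
    """Partial sum of K = 1 + sum_{n>=1} (16^-n binom(2n,n)^2 - 1/(pi n))."""
    total, r = 1.0, 1.0  # r = binom(2n, n) / 4^n, starting at n = 0
    for n in range(1, terms + 1):
        r *= (2 * n - 1) / (2 * n)  # update r from n - 1 to n
        total += r * r - 1.0 / (math.pi * n)
    return total
```

With the default cutoff the result should agree with the stated value of $K$ to about six decimal places.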

$d = 3$: This case is easy since $Q(z)$ remains finite at its singularity $z = 1$ where it admits an expansion in powers of $(1-z)^{1/2}$, to the effect that
$$q_n^{(3)} = \left(\frac{1}{2^{2n}}\binom{2n}{n}\right)^{3} \sim \frac{1}{\pi^{3/2}}\left(\frac{1}{n^{3/2}} - \frac{3}{8n^{5/2}} + \cdots\right).$$
The function $Q(z)$ is a priori $\Delta$-continuable and its singular expansion can be reconstructed from the form of coefficients:
$$Q(z) \underset{z \to 1}{\sim} Q(1) - \frac{2}{\pi}\sqrt{1-z} + O(|1-z|),$$
leading to
$$P(z) = \left(1 - \frac{1}{Q(1)}\right) - \frac{2}{\pi Q^{2}(1)}\sqrt{1-z} + O(|1-z|).$$
By singularity analysis, the last expansion gives
$$p_n^{(3)} = \frac{1}{\pi^{3/2} Q^{2}(1)}\,\frac{1}{n^{3/2}} + O\left(\frac{1}{n^{2}}\right), \qquad Q(1) = \frac{\pi}{\Gamma\left(\frac{3}{4}\right)^{4}} = 1.3932039296856768591842463.$$
A complete asymptotic expansion in powers $n^{-3/2}, n^{-5/2}, \ldots$ can be obtained by the same devices. In particular this improves the error term above to $O(n^{-5/2})$. The explicit form of $Q(1)$ results from its expression as the generalized hypergeometric ${}_3F_2[\frac{1}{2}, \frac{1}{2}, \frac{1}{2};\, 1, 1;\, 1]$, which evaluates by Clausen's theorem and Kummer's identity to the square of a complete elliptic integral. (See the papers by Larry Glasser for context, for instance [29]; nowadays, Maple and Mathematica even provide this value automatically.)
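
The value of $Q(1)$ can be cross-checked in two independent ways. The sketch below is an illustration with hypothetical function names: one routine evaluates the closed form $\pi/\Gamma(3/4)^{4}$, the other sums $q_n^{(3)}$ directly and adds a leading-order tail estimate, which is an assumption of this example based on $q_n^{(3)} \sim (\pi n)^{-3/2}$.

```python
import math

def q1_closed_form():
    """Q(1) for d = 3 via the closed form pi / Gamma(3/4)^4."""
    return math.pi / math.gamma(0.75) ** 4

def q1_partial_sum(terms=100_000):
    """Sum of (binom(2n,n)/4^n)^3 for n <= terms, plus a leading tail estimate."""
    total, r = 1.0, 1.0  # r = binom(2n, n) / 4^n
    for n in range(1, terms + 1):
        r *= (2 * n - 1) / (2 * n)
        total += r ** 3
    # Tail: sum over n > N of (pi n)^(-3/2) is approximately 2 / (pi^(3/2) sqrt(N)).
    return total + 2.0 / (math.pi ** 1.5 * math.sqrt(terms))
```

The two routines should agree to well within $10^{-6}$, and $1 - 1/Q(1)$ then gives the total return probability $P(1)$ of this three-dimensional walk.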

Higher dimensions are treated similarly, with logarithmic terms surfacing in 
asymptotic expansions for all even dimensions. □ 

We observe that, without the developments of the present paper, the precise 
asymptotic structure of such sequences is not obvious. Methods of the last exam- 
ple may be used to provide a rigorous setting to certain asymptotic enumeration 
results stated by physicists, where back-and-forth equivalences between singu- 
lar expansions of functions and asymptotic expansions of coefficients are often 
used without much justification. See for instance the works of Guttmann and 
collaborators [SJ [32] and Chyzak's numeric-symbolic study [8] relative to special 
self-avoiding polygons. 
5 Applications: first moments

Thanks to the extended singularity analysis toolkit, we are now in a position to analyze the tree recurrences that were introduced in Section 2. For each of the three models, two types of tolls are to be considered:
$$t_n^{\alpha} := n^{\alpha} \ \ (\text{with } \alpha > 0), \qquad t_n^{\log} := \log n, \tag{53}$$
and we assume in both cases that $t_0 = 0$. The corresponding ordinary generating functions are the polylogarithm $\mathrm{Li}_{-\alpha} = \mathrm{Li}_{-\alpha,0}$ and the specific $\mathrm{Li}_{0,1}$, whose singular expansions have been already recalled as Theorem 4. In each case, a linear transform $\mathcal{L}$ relates the generating function of costs, $f(z)$, to a generating function of tolls, either $t(z)$ (normalized) or $r(z)$ ("raw"). Theorems on composition of singularities make it possible to follow step by step the elementary operations of which $\mathcal{L}$ is composed and determine the effect of the $\mathcal{L}$ transform on singularities in a systematic manner. Given that computations are "automatic", we will mostly focus our discussion on main terms and on the global shape of singular expansions, leaving some of the details as exercises to the reader (or better, to a computer algebra engine).

The net outcome in each of the three tree models under consideration is the following: for large tolls, the cost is driven by the toll itself; for small tolls, the cost is of linear growth and, in a sense, "freely" caused by the recursion itself, that is, driven by the cumulation of costs due to small subtrees; in between, there is a threshold value of the toll where a "resonance" takes place between the toll and the recursion, leading to the emergence of a logarithmic factor. Such facts parallel what is familiar in the context of inhomogeneous linear differential equations, where either the free regime or the forced regime dominates, with logarithmic terms being created precisely by resonances.

5.1 The binary search tree recurrence

For the binary search tree model, there is an integral transform $\mathcal{L}$ that relates the ordinary generating function of tolls, $t(z)$, and the ordinary generating function of the induced costs, $f(z)$: it is given by (11) according to which $f(z) = \mathcal{L}[t(z)]$, where (with $t_0 = f_0 = 0$)
$$\mathcal{L}[t](z) := (1-z)^{-2}\int_0^z t'(w)\,(1-w)^{2}\,dw.$$
Consequently, the computation is entirely mechanical$^{8}$ and it only needs the theorems relating to integration, differentiation, and polylogarithms (Theorems 6 and 7) in conjunction with basic singularity analysis (Theorem 2). Our derivation below constitutes an alternative to parallel results by Neininger [50], Chern et al. [3, 7], and Fill and Kapur [19], who employ elementary but perhaps less transparent methods (typically, the approximation of discrete sums by integrals).

8 In the Maple system for symbolic computations about two dozen instructions suffice to implement calculations, once use is made of Bruno Salvy's package equivalent dedicated to the asymptotic analysis of coefficients of generating functions [56]. It suffices to program the polylogarithm expansions (Theorem 4), use the system capabilities for series expansions, differentiation, and integration (Theorems 6 and 7), and conclude by an appeal to Salvy's program that implements the basic transfers of Theorem 2.

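Theorem 13. Under the binary search tree model, the expected values of the costs induced by tolls of type $t_n^{\alpha}$ ($\alpha > 0$) and $t_n^{\log}$ admit full asymptotic expansions in descending powers of $n$ and integral powers of $\log n$. The main terms are summarized by the following table:

$$\begin{array}{ll}
\text{Toll } (t_n) & \text{Cost } (f_n) \\ \hline
n^{\alpha} \ \ (2 < \alpha) & \dfrac{\alpha+1}{\alpha-1}\,n^{\alpha} + O(n^{\alpha-1}) \\
n^{2} & 3n^{2} - 6n\log n + (10 - 6\gamma)\,n + O(\log n) \\
n^{\alpha} \ \ (1 < \alpha < 2) & \dfrac{\alpha+1}{\alpha-1}\,n^{\alpha} + K_{\alpha}\,n + O(n^{\alpha-1}) \\
n & 2n\log n + 2(\gamma - 1)\,n + 2\log n + 2\gamma + 1 + O\left(\dfrac{1}{n}\right) \\
n^{\alpha} \ \ (0 < \alpha < 1) & K_{\alpha}\,n + \dfrac{\alpha+1}{\alpha-1}\,n^{\alpha} + K_{\alpha} + o(1) \\
\log n & K'\,n - \log n + O(1)
\end{array}$$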

Proof. For the case of $\alpha$ a nonnegative integer, the integration can be carried out in finite terms since the generating function of tolls is rational. For instance, the case $\alpha = 1$ corresponds to the well-known analysis of Quicksort and binary search tree algorithms [40, 47, 58, 62].

For $t_n^{\alpha}$, it suffices to examine the effect of the $\mathcal{L}$ transform on singular elements of the form $c(1-z)^{\beta}$; e.g., for the main term corresponding to $t_n = n^{\alpha}$, we should take $\beta = -\alpha - 1$. The $\mathcal{L}$ transformation reads as a succession of operations, "differentiate, multiply by $(1-z)^{2}$, integrate, multiply by $(1-z)^{-2}$", all of which are covered by our previous theorems. The chain on any particular singular element starts as
$$c(1-z)^{\beta} \ \xrightarrow{\ \partial\ }\ -c\beta\,(1-z)^{\beta-1} \ \xrightarrow{\ \times(1-z)^{2}\ }\ -c\beta\,(1-z)^{\beta+1}.$$
At this stage, integration intervenes. Assume that $\beta + 1 \neq -1$. (Otherwise, a logarithm appears.) According to Theorem 7, and ignoring integration constants for the moment, integration gives
$$-c\beta\,(1-z)^{\beta+1} \ \xrightarrow{\ \int\ }\ \frac{c\beta}{\beta+2}\,(1-z)^{\beta+2} \ \xrightarrow{\ \times(1-z)^{-2}\ }\ c\,\frac{\beta}{\beta+2}\,(1-z)^{\beta}.$$
Then this singular element corresponds to a contribution
$$c\,\frac{\beta}{\beta+2}\binom{n-\beta-1}{-\beta-1},$$
which is of order $O(n^{-\beta-1})$. (The treatment of logarithmic terms is entirely similar.)

The derivation above has left aside the determination of the integration 
constants. These are given by the second case of Theorem \7\ which provides 
in particular access to the constants K a and K' . The constant term in the 
asymptotic expansion of the integral is of the form 

K[t] := [ \t'{w)(l - wf - (t'(w)(l - w) 2 ) 



dw, 

where /_ represents the sum of the singular terms in / having exponent < — 1, 
as in the proof of Theorem [71 In the singular expansion of f(z), this integration 
constant gets further multiplied by (1 — z)~ 2 ; the resulting linear term in the 
asymptotic expansion of /„ is then plainly 

K[t] • (n + 1). 

In particular, if the growth of t n is smaller than n, then, the divergence part is 
absent and K[t] reduces to 



K[t]= t'(w){l~w) 2 dw = 2Y] 
Jo i 



^ (n + l)(n + 2)' 

as follows from expanding the integrand around and integrating the resulting 
series. This yields the following values for a < 1: 

K a = 2Y 7 !1 -, K'=2Y- ^ -, (54) 

£<(n+l)(n + 2)' ^(n + l)(n + 2)' 1 > 

while for 1 < a < 2, 

i<*-r(a + l){ n+ a a ) 



(n + l)(n + 2) 



n=l 

The theorem is finally established. □ 
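The entries of the table lend themselves to a quick numerical check. The sketch below is only illustrative: it assumes that the binary search tree recurrence has the standard Quicksort form $f_n = t_n + \frac{2}{n}\sum_{k<n} f_k$ with $f_0 = 0$ (an assumption here, since the recurrence itself is stated in an earlier section), and all function names are hypothetical. It compares exact costs for the toll $n^2$ with the main terms $3n^2 - 6n\log n + (10 - 6\gamma)n$, and it evaluates the solved form $f_n = t_n + 2(n+1)\sum_{k<n} t_k/((k+1)(k+2))$, which is the discrete counterpart of the linear term $K[t]\cdot(n+1)$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant


def bst_costs(toll, n_max):
    """Return [f_0, ..., f_n_max] for f_n = t_n + (2/n) * (f_0 + ... + f_{n-1}), f_0 = 0.

    The recurrence is an assumption of this sketch (standard Quicksort form).
    """
    costs = [0.0] * (n_max + 1)
    prefix = 0.0  # running value of f_0 + ... + f_{n-1}
    for n in range(1, n_max + 1):
        costs[n] = toll(n) + 2.0 * prefix / n
        prefix += costs[n]
    return costs


def solved_form(toll, n):
    """Solved recurrence: f_n = t_n + 2(n+1) * sum_{1<=k<n} t_k / ((k+1)(k+2))."""
    partial = math.fsum(toll(k) / ((k + 1) * (k + 2)) for k in range(1, n))
    return toll(n) + 2.0 * (n + 1) * partial


def table_entry_n_squared(n):
    """Main terms listed in the table for the toll n^2."""
    return 3 * n**2 - 6 * n * math.log(n) + (10 - 6 * EULER_GAMMA) * n


if __name__ == "__main__":
    n = 2000
    exact = bst_costs(lambda k: float(k * k), n)[n]
    # The difference is the remainder term, expected to be of order log n.
    print(exact - table_entry_n_squared(n))
```

The same routine with the toll $n^3$ gives a ratio $f_n/n^3$ close to $(\alpha+1)/(\alpha-1) = 2$, in line with the first row of the table.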

Remark. The slowly convergent series expressions of $K_\alpha$, $K'$ can be rephrased
as definite integrals, thanks to Mellin transform techniques. The starting point
is the easy formal identity,

$$\sum_{n \ge 1} c_n n^{-s} = \frac{1}{\Gamma(s)} \int_0^{\infty} \Bigl(\sum_{n \ge 1} c_n e^{-nx}\Bigr) x^{s-1}\,dx. \qquad (55)$$

The constant $K_\alpha$ with $\alpha < 1$ corresponds to $s = 1 - \alpha$ and $c_n = n/[(n+1)(n+2)]$,
for which the integrand admits of closed form since

$$\sum_{n \ge 1} \frac{n}{(n+1)(n+2)}\, z^n = \frac{1}{z^2}\,\bigl[(2-z)\,L(z) - 2z\bigr].$$

From there, the constant $K'$ is attained as $\frac{d}{d\alpha} K_\alpha \big|_{\alpha=0}$. A final change of
variables $x = -\log t$ then yields an integral representation for "Fill's first
logarithmic constant" ($\gamma$ is Euler's constant):

$$K' = -\gamma - 2\int_0^1 \bigl[(t-2)\log(1-t) - 2t\bigr]\,\Bigl(\log\log\frac{1}{t}\Bigr)\,\frac{dt}{t^3} \qquad (56)$$
$$\phantom{K'} = 1.20356491674961033428628333814873131775552838577096.$$

The last estimate to 50D improves on the earlier 3D evaluation of Fill [16]. The
cost induced by $t_n^{\log}$ is of particular interest as it is precisely the entropy of the
distribution of binary search trees; see the account and first estimates in the
book by Cover and Thomas pJJ, p. 72-74] , as well as pointers to self-organizing 
search in Fill's article [16]. In his doctoral dissertation [38, Section 5.1], Kapur
has extended the methods and estimates to m-ary search trees. 
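The quoted value of $K'$ can be cross-checked directly from the series in (54). The sketch below is an illustration under stated assumptions, not the Mellin-based method of the Remark: it sums the first terms of $2\sum \log n/((n+1)(n+2))$ and estimates the tail from the expansion $2\log x\,(x^{-2} - 3x^{-3} + \cdots)$ together with the usual half-term Euler-Maclaurin correction. The function name and the cut-off are hypothetical choices.

```python
import math


def fill_first_constant(n_terms=200_000):
    """Approximate K' = 2 * sum_{n>=1} log(n) / ((n+1)(n+2)).

    Head: direct summation up to n_terms.
    Tail: integral of 2*log(x)*(1/x^2 - 3/x^3) from n_terms to infinity,
    minus half of the last summed term (Euler-Maclaurin correction).
    """
    head = math.fsum(
        2.0 * math.log(n) / ((n + 1) * (n + 2)) for n in range(2, n_terms + 1)
    )
    big = float(n_terms)
    log_big = math.log(big)
    tail_integral = 2.0 * (log_big + 1.0) / big - (3.0 * log_big + 1.5) / big**2
    half_last_term = log_big / ((big + 1.0) * (big + 2.0))
    return head + tail_integral - half_last_term


if __name__ == "__main__":
    print(fill_first_constant())
```

With the default cut-off the neglected terms are far below $10^{-9}$, so the result should agree with the first several digits of the 50D value quoted above.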



5.2 The uniform binary tree recurrence

This section examines the uniform binary tree model that surfaces recurrently
in combinatorics. Here, we put on a firm basis a classification of the expected
costs corresponding to the tolls $t_n^{\alpha}$ and $t_n^{\log}$ which was outlined (with several
typographical errors) in an article by Flajolet and Steyaert [28]. The particular
case of the toll $t_n = n$ has, like for binary search trees, a dignified history as
it corresponds to path length in Catalan trees and to area under Dyck paths,
whose first distributional analyses go back to Louchard and Takacs [45, 61].

Our starting point is (19), according to which the generating function of
costs $f(z) = \sum_n C_n f_n z^n$ normalized by the Catalan numbers $C_n$ and the ordinary
generating function of tolls $t(z) = \sum_n t_n z^n$ are related by $f(z) = \mathcal{L}[t(z)]$, where

$$\mathcal{L}[t(z)] = \frac{1}{\sqrt{1-4z}}\,\bigl(t(z) \odot C(z)\bigr), \qquad (57)$$

with

$$C(z) := \sum_{n \ge 0} C_n z^n, \qquad C_n = \frac{1}{n+1}\binom{2n}{n}.$$

We state:



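Theorem 14. Under the uniform binary tree model, the expected values of
the costs induced by tolls of type $t_n^{\alpha}$ ($\alpha > 0$) and $t_n^{\log}$ admit full asymptotic
expansions in descending powers of $n$ and integral powers of $\log n$. The main
terms are summarized by the following table:

Toll ($t_n$)                                       Cost ($f_n$)
$n^{\alpha}$  ($\frac32 < \alpha$)                 $\frac{\Gamma(\alpha-\frac12)}{\Gamma(\alpha)}\,n^{\alpha+\frac12} + O(n^{\alpha-\frac12})$
$n^{3/2}$                                          $\frac{1}{\Gamma(3/2)}\,n^2 + O(n\log n)$
$n^{\alpha}$  ($\frac12 < \alpha < \frac32$)       $\frac{\Gamma(\alpha-\frac12)}{\Gamma(\alpha)}\,n^{\alpha+\frac12} + O(n)$
$n^{1/2}$                                          $\frac{1}{\sqrt{\pi}}\,n\log n + O(n)$
$n^{\alpha}$  ($0 < \alpha < \frac12$)             $K_\alpha n + \frac{\Gamma(\alpha-\frac12)}{\Gamma(\alpha)}\,n^{\alpha+\frac12} + O(1)$
$\log n$                                           $K'_0\, n - 2\sqrt{\pi}\,n^{1/2} + O(1)$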



Proof. For the tolls $n^{\alpha}$, all that is required is to determine the singular expansion
of

$$t(z) \odot C(z/4) = \sum_{n \ge 1} \frac{n^{\alpha}}{n+1}\binom{2n}{n}\Bigl(\frac{z}{4}\Bigr)^n.$$

(For convenience, the singularity has been scaled to 1.) We use the Zigzag
Algorithm presented in Subsection 4.3. The known asymptotic expansion of the
Catalan numbers is

$$C_n 4^{-n} \sim \frac{1}{\sqrt{\pi}}\,n^{-3/2}\Bigl(1 - \frac{9}{8n} + \frac{145}{128n^2} + \cdots\Bigr).$$

Multiply this by $n^{\alpha}$ to get the expansion of $n^{\alpha} C_n 4^{-n}$. The terms now
involve the scale $\{n^{\alpha-\frac32}, n^{\alpha-\frac52}, \ldots\}$. Assume that $\alpha$ is not a half-integer [i.e.,
$\alpha \notin (\frac12\mathbb{Z}) \setminus \mathbb{Z}$]; see below for the contrary case. Then the basis of functions

$$\mathcal{B} = \bigl\{(1-z)^{-\alpha+\frac12+k}\bigr\},$$

where $k$ ranges over the integers, has the property that the coefficients of its
generic element are $O(n^{\alpha-k-\frac32})$; in particular,

$$[z^n](1-z)^{-\alpha+\frac12} \sim \frac{n^{\alpha-\frac32}}{\Gamma(\alpha-\frac12)}\Bigl(1 + \frac{(2\alpha-1)(2\alpha-3)}{8n} + \cdots\Bigr).$$

We can thus find a singular function $H(z)$ whose coefficients match asymptotically
those of $t(z) \odot C(z/4)$, which is of the form

$$H(z) = \frac{\Gamma(\alpha-\frac12)}{\sqrt{\pi}}\,(1-z)^{-\alpha+\frac12}\bigl(1 + c_1(1-z) + c_2(1-z)^2 + \cdots\bigr)$$

for some effectively computable sequence $(c_j)$. The singular expansion of
$t(z) \odot C(z/4)$ is then the sum of the expansion of $H$ above and of a power series
in $(1-z)$, call it $P(z)$, that can be determined according to the principles of
Section 4.
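The coefficient expansion used in this step of the Zigzag Algorithm can be checked numerically. The following sketch (function names are hypothetical) compares the exact coefficient $[z^n](1-z)^{-a} = \Gamma(n+a)/(\Gamma(a)\,n!)$ with the two-term approximation $\frac{n^{a-1}}{\Gamma(a)}\bigl(1 + \frac{a(a-1)}{2n}\bigr)$; with $a = \alpha - \frac12$ the correction $a(a-1)/2$ equals $(2\alpha-1)(2\alpha-3)/8$.

```python
import math


def coeff_exact(a, n):
    """[z^n] (1 - z)^(-a) = Gamma(n + a) / (Gamma(a) * n!), valid for a > 0."""
    return math.exp(math.lgamma(n + a) - math.lgamma(a) - math.lgamma(n + 1))


def coeff_two_terms(a, n):
    """Two-term approximation n^(a-1)/Gamma(a) * (1 + a(a-1)/(2n))."""
    return n ** (a - 1) / math.gamma(a) * (1 + a * (a - 1) / (2 * n))


if __name__ == "__main__":
    alpha = 1.3            # illustrative toll exponent (not a half-integer)
    a = alpha - 0.5        # exponent of the basis element (1 - z)^(-alpha + 1/2)
    for n in (10, 100, 1000):
        print(n, coeff_exact(a, n) / coeff_two_terms(a, n))
```

The printed ratios approach 1 at rate $O(n^{-2})$, which is what matching coefficients term by term against the basis $\mathcal{B}$ relies on.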
The singular expansion of $f(z/4)$ is that of $H(z) + P(z)$ divided by $\sqrt{1-z}$,
so that, by transfer, we get

$$4^{-n} C_n f_n \sim \frac{\Gamma(\alpha-\frac12)}{\sqrt{\pi}\,\Gamma(\alpha)}\,n^{\alpha-1}\Bigl(1 + \frac{c'_1}{n} + \frac{c'_2}{n^2} + \cdots\Bigr) + [z^n]\,\frac{P(z)}{\sqrt{1-z}}$$

for some sequence $(c'_j)$, where the "hidden" analytic part $P(z)$ arises from the
"hidden" analytic component in $t(z) \odot C(z/4)$. After dividing by $C_n 4^{-n}$, one
finds finally:

$$f_n \sim \frac{\Gamma(\alpha-\frac12)}{\Gamma(\alpha)}\,n^{\alpha+\frac12}\Bigl(1 + \frac{c''_1}{n} + \cdots\Bigr) + R_n, \qquad (58)$$

where the "hidden" remainder term $R_n$ is of the form

$$R_n \sim d_{-1}\,n + d_0 + \frac{d_1}{n} + \cdots.$$

This last estimate provides all the entries in the table above, whenever $\alpha$ is not
a half-integer, as it suffices to merge the two expansions of (58). In addition,
when $0 < \alpha < \frac12$, the series defining $(t \odot C)(z/4)$ converges at the singularity 1. Thus,
the dominant asymptotic term of $f(z/4)$ is $(t \odot C)(\frac14)/\sqrt{1-z}$, that is, $K_\alpha/\sqrt{1-z}$ with

$$K_\alpha = \sum_{n \ge 1} \frac{n^{\alpha}}{n+1}\binom{2n}{n}\,4^{-n}.$$

When $\alpha$ is a half-integer, logarithmic terms appear due to the presence of
inverse integral powers of $n$ in the coefficients of $(t \odot C)(z/4)$, but the derivation is
otherwise similar. For instance at $\alpha = \frac12$, one has

$$4^{-n}\sqrt{n}\,C_n \sim \frac{1}{\sqrt{\pi}}\,\frac{1}{n} + O\Bigl(\frac{1}{n^2}\Bigr),$$

which shows that

$$(t \odot C)(z/4) = H(z) + P_0 + O\bigl((1-z)^{1-\epsilon}\bigr), \qquad H(z) = \frac{1}{\sqrt{\pi}}\,L(z),$$

resulting in the stated estimate.

Finally, when $t_n = \log n$, we have $t(z) = \mathrm{Li}_{0,1}(z) = O(|1-z|^{-1-\epsilon})$ for any
$\epsilon > 0$. Thus, by Proposition 10(i),

$$(t \odot C)(z/4) = K'_0 + O\bigl(|1-z|^{\frac12-\epsilon}\bigr), \qquad K'_0 := \sum_{n=1}^{\infty} (\log n)\,\frac{C_n}{4^n}.$$

Singularity analysis and the estimate for $C_n$ yield $f_n = K'_0\,n + O(n^{\frac12+\epsilon})$. Carrying
higher-order terms, we get the mean of the shape functional,

$$\mu_n = K'_0\,n - 2\sqrt{\pi}\,n^{1/2} + O(1), \qquad (59)$$

which agrees with the estimate in Theorem 3.1 of [16] and improves the remainder
estimate. □
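As a sanity check of the table for the toll $t_n = n$, costs under the uniform binary tree model can be computed exactly. The sketch below assumes the recurrence $f_n = t_n + \sum_k \frac{C_k C_{n-1-k}}{C_n}\,(f_k + f_{n-1-k})$ with $f_0 = 0$ (an assumption of this illustration, as the recurrence is stated in an earlier section) and works with the normalized quantities $F_n = C_n f_n$, which are integers. Function names are hypothetical.

```python
import math


def catalan_numbers(n_max):
    """C_0..C_n_max via C_{n+1} = C_n * 2(2n+1)/(n+2), in exact integers."""
    cat = [1]
    for n in range(n_max):
        cat.append(cat[-1] * 2 * (2 * n + 1) // (n + 2))
    return cat


def normalized_costs(toll, n_max):
    """Return (C, F) with F_n = C_n * f_n under the assumed Catalan recurrence.

    Multiplying the recurrence by C_n gives
        F_n = C_n * t_n + 2 * sum_{k=0}^{n-1} C_k * F_{n-1-k},   F_0 = 0.
    """
    cat = catalan_numbers(n_max)
    big_f = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        conv = sum(cat[k] * big_f[n - 1 - k] for k in range(n))
        big_f[n] = cat[n] * toll(n) + 2 * conv
    return cat, big_f


if __name__ == "__main__":
    n = 200
    cat, big_f = normalized_costs(lambda m: m, n)
    mean_cost = big_f[n] / cat[n]
    # Table entry for alpha = 1: Gamma(1/2)/Gamma(1) * n^(3/2) + O(n).
    print(mean_cost, math.sqrt(math.pi) * n**1.5)
```

For this toll the generating function is explicit, $F(z) = zC'(z)/\sqrt{1-4z}$, which gives $F_n = 4^n - (2n+1)C_n$; the computed values can be compared against that closed form as well as against the leading term $\sqrt{\pi}\,n^{3/2}$.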
The Mellin technique of (55) is once more applicable to the determination
of "Fill's second logarithmic constant" $K'_0$. It provides the value:

$$K'_0 = \sum_{k=1}^{\infty} \frac{\log k}{k+1}\binom{2k}{k}\,4^{-k} = -\gamma - \int_0^1 \frac{1}{\sqrt{1-t}\,\bigl(1+\sqrt{1-t}\bigr)^2}\,\Bigl(\log\log\frac{1}{t}\Bigr)\,dt$$
$$\phantom{K'_0} = 2.0254384677765738877135187391417652470652930617658.$$
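The series form of this constant can be evaluated crudely as a cross-check. The sketch below is illustrative only (hypothetical function name and cut-off): it sums $\sum (\log k)\,C_k 4^{-k}$ directly and adds the leading tail estimate obtained from $C_k 4^{-k} \sim k^{-3/2}/\sqrt{\pi}$, namely $\int_N^\infty \log x \cdot x^{-3/2}\,dx/\sqrt{\pi} = (2\log N + 4)/\sqrt{\pi N}$. Since only the leading tail term is kept, agreement to about six decimal places is all that should be expected.

```python
import math


def fill_second_constant(n_terms=100_000):
    """Approximate sum_{k>=1} log(k) * C_k / 4^k with a leading-order tail estimate."""
    head = 0.0
    weight = 1.0  # C_0 / 4^0
    for k in range(1, n_terms + 1):
        # C_k / 4^k = (C_{k-1} / 4^{k-1}) * (2k - 1) / (2(k + 1))
        weight *= (2 * k - 1) / (2 * (k + 1))
        head += math.log(k) * weight
    tail = (2.0 * math.log(n_terms) + 4.0) / math.sqrt(math.pi * n_terms)
    return head + tail


if __name__ == "__main__":
    print(fill_second_constant())
```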

The subject of costs on binary trees is considered in greater depth in [18] by 
applying the techniques developed in this paper. There, some higher-order 
estimates, asymptotics for moments of each order, and limiting distributions 
are derived when the toll sequence is either n a or logn. 

Our methods can also be used to treat more generally the case of all simple 
families of trees in the sense of Meir and Moon [49] , of which Catalan trees are 
a special case. This generalization is the subject of ongoing work. 



5.3 The union-find tree recurrence

In this subsection, we revisit the Knuth-Pittel-Schonhage recurrence
corresponding to the destruction of free labelled trees and dually to the management
of equivalence relations [42, 43]. The main result of this section is essentially
a rephrasing of the main results of Knuth and Pittel in [42], to which we add
the possibility of determining complete asymptotic expansions. Like before,
the starting point is the integral transform (23) (adjusted for the fact that
$t_1 = f_1 = 1 \neq 0$), which relates the ordinary generating function of tolls $t(z)$
to the normalized generating function of costs $f(z)$ via $f(z) = \mathcal{L}[t(z)]$, where

$$\mathcal{L}[t(z)] = t_1\,z\,T'(z) + \frac12\,\frac{T(z)}{1-T(z)} \int_0^z \partial_w\bigl(t(w) \odot T^2(w)\bigr)\,\frac{dw}{T(w)}. \qquad (60)$$

There $T(z)$ is the Cayley tree function whose singular expansion at the (unique)
dominant singularity $z = e^{-1}$ is well known: one has the shape

$$T(z) \sim 1 - \sqrt{2}\,(1-ez)^{1/2} + c_1(1-ez) + \cdots \qquad (61)$$

as $z \to e^{-1}$ in any sector of angle $< 2\pi$; see also [42, Eq. (3.16)]. (The paper by
Corless et al. [9] is a definitive reference regarding the tree function.) As noted
earlier, the case of union-find tree recurrences combines all the composition
results developed in this paper.
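The square-root term in the singular expansion of $T$ can be related to coefficients by basic transfer: $-\sqrt{2}\,(1-z)^{1/2}$ has coefficients asymptotic to $-\sqrt{2}\,n^{-3/2}/\Gamma(-\frac12) = n^{-3/2}/\sqrt{2\pi}$. The sketch below (hypothetical function names) compares this prediction with the exact coefficients $e^{-n}[z^n]T(z) = e^{-n} n^{n-1}/n!$ of the Cayley tree function.

```python
import math


def tree_coeff_scaled(n):
    """e^{-n} [z^n] T(z) = e^{-n} n^{n-1} / n!, computed through logarithms."""
    return math.exp((n - 1) * math.log(n) - math.lgamma(n + 1) - n)


def transfer_prediction(n):
    """Leading coefficient asymptotics of -sqrt(2) (1 - z)^{1/2}: n^{-3/2} / sqrt(2 pi)."""
    return n ** -1.5 / math.sqrt(2 * math.pi)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, tree_coeff_scaled(n) / transfer_prediction(n))
```

The ratio tends to 1 with a relative error of order $1/n$, consistent with Stirling's formula.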

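Theorem 15. Under the union-find tree recurrence model, the expected values
of the costs induced by tolls of type $t_n^{\alpha}$ ($\alpha > 0$) and $t_n^{\log}$ admit full asymptotic
expansions in descending powers of $n$ and integral powers of $\log n$. The main
terms are summarized by the following table:

Toll ($t_n$)                                   Cost ($f_n$)
$n^{\alpha}$  ($1 < \alpha$)                   $\frac{\Gamma(\alpha-\frac12)}{\sqrt{2}\,\Gamma(\alpha)}\,n^{\alpha+\frac12} + O(n^{\alpha})$
$n$                                            $\sqrt{\frac{\pi}{2}}\,n^{3/2} + O(n\log n)$
$n^{\alpha}$  ($\frac12 < \alpha < 1$)         $\frac{\Gamma(\alpha-\frac12)}{\sqrt{2}\,\Gamma(\alpha)}\,n^{\alpha+\frac12} + O(n)$
$n^{1/2}$                                      $\frac{1}{\sqrt{2\pi}}\,n\log n + O(n)$
$n^{\alpha}$  ($0 < \alpha < \frac12$)         $(1 + \frac12 K_\alpha)\,n + O(n^{\alpha+\frac12})$
$\log n$                                       $\frac12 K'\,n + O(\sqrt{n})$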



Proof. We shall content ourselves with indicating the way full asymptotic
expansions can be determined within the generating function framework. (Detailed
computations are left as an exercise for the reader.) In what follows, we set
$Z = (1-z)$ and let $\mathcal{A}$ denote an unspecified entire series in powers of $Z$, not
necessarily the same at each occurrence. For instance, one may summarize
diversely the expansion (61) of $T(z/e)$ as

$$T(z/e) \sim 1 - \sqrt{2}\,Z^{1/2} + Z\mathcal{A} + Z^{3/2}\mathcal{A} \sim \mathcal{A} + Z^{1/2}\mathcal{A},$$

and so on. We shall also let $\mathcal{N}$ denote generically a series in nonnegative powers
of $1/n$.

We consider first the case of the toll $t_n^{\alpha}$ and assume for simplicity that $\alpha$
is not a half-integer: $\alpha \notin (\frac12\mathbb{Z}) \setminus \mathbb{Z}$. The polylogarithm expansions grant us
a priori that the generating function $t(z)$ lies in the class of functions amenable
to singularity analysis, with

$$t(z) \sim Z^{-\alpha-1}\mathcal{A} + \mathcal{A}.$$

Therefore, the Hadamard product $(t(z) \odot T^2(z/e))$ is also amenable. The
coefficients of the latter function are of the form $n^{\alpha-\frac32}\mathcal{N}$, as follows from the fact that
$[z^n]t(z) = n^{\alpha}$ and $[z^n]T^2(z/e) \sim n^{-3/2}\mathcal{N}$ (by the singular expansion of $T^2$).
Thus, converting back this information to the function, we find

$$t(z) \odot T^2(z/e) \sim Z^{-\alpha+\frac12}\mathcal{A} + \mathcal{A}, \qquad \partial_z\bigl(t(z) \odot T^2(z/e)\bigr) \sim Z^{-\alpha-\frac12}\mathcal{A} + \mathcal{A}.$$

What we have done here is to apply the Zigzag Algorithm of Section 4 and the
differentiation theorem. Then multiplication by $1/T(z/e) \sim \mathcal{A} + Z^{1/2}\mathcal{A}$ shows
that

$$\frac{1}{T(z/e)}\,\partial_z\bigl[t(z) \odot T^2(z/e)\bigr] \sim Z^{-\alpha-\frac12}\mathcal{A} + Z^{-\alpha}\mathcal{A} + \mathcal{A} + Z^{\frac12}\mathcal{A}.$$

Integration of this last expansion corresponds to increasing all exponents by 1.
Finally one should multiply by $T(z/e)\,(1 - T(z/e))^{-1}$, which is of type $Z^{-\frac12}\mathcal{A} + \mathcal{A}$.
This completes our handling of the second term on the right in (60). Also,

$$\frac{z}{e}\,T'(z/e) \sim Z^{-\frac12}\mathcal{A} + \mathcal{A}.$$

The end result is then

$$f(z/e) \sim Z^{-\alpha}\mathcal{A} + Z^{-\alpha+\frac12}\mathcal{A} + Z^{-\frac12}\mathcal{A} + \mathcal{A}.$$

The dominant term is $Z^{-\alpha}$ when $\alpha > \frac12$ whereas it is $Z^{-1/2}$ when $\alpha < \frac12$.
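The bookkeeping of exponents in this chain is mechanical and can be mimicked in a few lines. In the sketch below (an illustration with hypothetical names, valid for generic $\alpha$ only), a pair `(a, c)` stands for a term $Z^{a\alpha + c}\mathcal{A}$; exponents that differ by a nonnegative integer are merged, since they are absorbed into the unspecified series $\mathcal{A}$. The logarithmic cases (half-integer $\alpha$) are not modelled.

```python
from fractions import Fraction as Fr

HALF = Fr(1, 2)
ANALYTIC = (0, Fr(0))  # the class A: a plain power series in Z


def normalize(terms):
    """Keep one representative Z^(a*alpha + c) per class, with the smallest c."""
    best = {}
    for a, c in terms:
        key = (a, c % 1)
        if key not in best or c < best[key]:
            best[key] = c
    result = set()
    for (a, _), c in best.items():
        if a == 0 and c % 1 == 0 and c > 0:
            c = Fr(0)  # Z^k * A with k >= 1 is still of type A
        result.add((a, c))
    return result


def shift(terms, delta):
    """Differentiate (delta = -1) or integrate (delta = +1); type A is preserved."""
    return normalize({t if t == ANALYTIC else (t[0], t[1] + delta) for t in terms})


def multiply(left, right):
    """Ordinary product of two expansions: exponents add pairwise."""
    return normalize({(a1 + a2, c1 + c2) for a1, c1 in left for a2, c2 in right})


def union_find_chain():
    """Exponent classes of the integral term of f(z/e), following the proof."""
    hadamard = {(-1, HALF), ANALYTIC}                   # t (.) T^2 ~ Z^(-alpha+1/2) A + A
    step = shift(hadamard, -1)                          # differentiate
    step = multiply(step, {ANALYTIC, (0, HALF)})        # times 1/T ~ A + Z^(1/2) A
    step = shift(step, +1)                              # integrate
    return multiply(step, {(0, -HALF), ANALYTIC})       # times T/(1-T) ~ Z^(-1/2) A + A


if __name__ == "__main__":
    for a, c in sorted(union_find_chain()):
        print(f"Z^({a}*alpha + {c}) * A")
```

The resulting classes are $Z^{-\alpha}$, $Z^{-\alpha+\frac12}$, $Z^{-\frac12}$, and the analytic class, matching the end result displayed above.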

At the same time, it is a simple task to trace the coefficients of main terms.
For $\alpha > \frac12$, the main term of $f(z/e)$ is $Z^{-\alpha}$, and one finds successively

$$t(z) \odot T^2(z/e) \sim \sqrt{\frac{2}{\pi}}\;\Gamma(\alpha-\tfrac12)\,(1-z)^{-\alpha+\frac12},$$

$$f(z/e) \sim \frac{\Gamma(\alpha-\frac12)}{2\sqrt{\pi}}\,(1-z)^{-\alpha},$$

where the last equation implies, via singularity analysis, an estimate of expected
costs:

$$f_n \sim \frac{\Gamma(\alpha-\frac12)}{\sqrt{2}\,\Gamma(\alpha)}\,n^{\alpha+\frac12}.$$

For $\alpha < \frac12$, the main term is $Z^{-1/2}$ and its coefficient is seen to arise from
both terms on the right in (60): we have

$$f(z/e) \sim \frac{1}{\sqrt{2}}\Bigl(1 + \frac12 K_\alpha\Bigr)(1-z)^{-1/2},$$

where $K_\alpha = K[n^{\alpha}]$ and the functional $K$ is

$$K[t] := \int_0^{1/e} \partial_w\bigl(t(w) \odot T^2(w)\bigr)\,\frac{dw}{T(w)}. \qquad (62)$$

Error terms can be similarly traced: in the case of $f(z/e)$, it is of type $Z^{-\alpha}$
if $0 < \alpha < \frac12$, of type $Z^{-1/2}$ if $\frac12 < \alpha < 1$, and so on. The end results are
summarized in the statement of the theorem.

For half-integer $\alpha$, a logarithmic term appears. For instance, in the case
$\alpha = \frac12$, this fact is associated to the shape of the coefficients

$$[z^n]\bigl(t(z) \odot T^2(z/e)\bigr) \sim -\frac{2\sqrt{2}}{\Gamma(-1/2)}\,\frac{1}{n} + \frac{1}{n^2}\,\mathcal{N},$$

resulting in a singular expansion with a logarithmic term:

$$t(z) \odot T^2(z/e) = -\sqrt{\frac{2}{\pi}}\,\log Z + c + Z\mathcal{A} + (Z\log Z)\mathcal{A},$$

for some $c$.

For the logarithmic toll [note that now $t_1 = 0$, so that the first term in (60)
does not contribute] we have $t(z) = \mathrm{Li}_{0,1}(z)$, the integral in (60) is convergent
and, in the same way as for the case $\alpha < 1/2$, we get

$$f(z) = \frac{K'}{2\sqrt{2}}\,(1-ez)^{-1/2} + O\bigl(|1-ez|^{-\epsilon}\bigr),$$

which implies

$$f_n = \frac12 K'\,n + O\bigl(n^{\frac12+\epsilon}\bigr),$$

with $K' = K[\log n]$ and $K$ defined at (62). □

It is of interest to compare our approach to that of Knuth and Pittel [42] . 
These authors use what is fundamentally a "repertoire approach" , based on the 
transforms of two types of tolls, the Dirac tolls $\delta_{mn}$ and another family related
to "tree polynomials" . Their methods do not clearly appear to be extendible 
to the extraction of sublinear terms in asymptotic expansions. At the same 
time, their developments require appreciably more involved and perhaps less 
transparent computations. 



6 Perspectives 

In this concluding section, we discuss at a fairly informal and abstract level 
applications of the extended singularity analysis toolkit developed in the present 
paper in two further directions: the determination of higher-order moments for 
our basic models, and the treatment of tree recurrences which are more complex 
than the ones present in our lead examples. (Some of our examples below may 
accordingly involve nonbinary tree models.) 

6.1 Higher moments and limit distributions 

Let us return to the general framework of Section 1. There, the random cost $X_n$
is related to costs $X_{K_n}$ and $\tilde{X}_{n-a-K_n}$ by the fundamental recursion (2). Raising
both members of (2) to some integral power $s$ yields

$$X_n^s = X_{K_n}^s + \tilde{X}_{n-a-K_n}^s + \sum_{\substack{s_1+s_2+s_3=s \\ s_2,\,s_3<s}} \binom{s}{s_1,s_2,s_3}\,t_n^{s_1}\,X_{K_n}^{s_2}\,\tilde{X}_{n-a-K_n}^{s_3}, \qquad (63)$$

where we have made use of the multinomial expansion and have isolated the two
$s$th powers. Take expectations with respect to the model $\mathfrak{M}_n$ and set
$\mu_n^{(s)} := \mathbb{E}(X_n^s)$. The recursion on $s$th moments becomes, thanks to independence of
the $X$ and $\tilde{X}$ sequences on the right in (63),

$$\mu_n^{(s)} = r_n^{(s)} + \sum_k p_{n,k}\,\bigl(\mu_k^{(s)} + \mu_{n-a-k}^{(s)}\bigr), \qquad (64)$$

where $p_{n,k} = \mathbb{P}(K_n = k)$ are the splitting probabilities and

$$r_n^{(s)} := \sum_{\substack{s_1+s_2+s_3=s \\ s_2,\,s_3<s}} \binom{s}{s_1,s_2,s_3}\,t_n^{s_1} \sum_k p_{n,k}\,\mu_k^{(s_2)}\,\mu_{n-a-k}^{(s_3)}.$$

This calculation shows that the sequence of $s$th moments for any fixed $s$ satisfies
the same type of recurrence as the first moments, save for a more complicated
toll $(r_n^{(s)})$ that involves moments of the smaller orders $0, 1, \ldots, s-1$. Define the
normalized generating functions

$$\mu^{(s)}(z) := \sum_n \omega_n\,\mu_n^{(s)}\,z^n, \qquad r^{(s)}(z) := \sum_n \omega_n\,r_n^{(s)}\,z^n,$$

with the normalization sequence $\omega_n = 1$ for binary search trees and $\omega_n = C_n$
for uniform binary trees. Then the relation (64) is solved in terms of generating
functions by an $\mathcal{L}$-transform as

$$\mu^{(s)}(z) = \mathcal{L}\bigl(r^{(s)}(z)\bigr), \qquad r^{(s)}(z) = \sum_{\substack{s_1+s_2+s_3=s \\ s_2,\,s_3<s}} \binom{s}{s_1,s_2,s_3}\,t^{\odot s_1}(z) \odot Q\bigl(\mu^{(s_2)}(z), \mu^{(s_3)}(z)\bigr), \qquad (65)$$

where $Q$ is, for the binary search tree model and Catalan model, respectively,

$$Q^{\mathrm{BST}}(a(z), b(z)) = \int_0^z a(t)\,b(t)\,dt, \qquad Q^{\mathrm{Cat}}(a(z), b(z)) = z\,a(z)\,b(z). \qquad (66)$$

The $\mathcal{L}$ transform is given in (53) and (57) for the respective cases; the case
of the union-find tree model [where $\omega_n = n^{n-1}/n!$ is used for $\mu^{(s)}$ and
$\omega'_n = n^{n-2}(n-1)/n!$ is used for $r^{(s)}$] is similar but more complicated; see (60) for $\mathcal{L}$,
while

$$Q^{\mathrm{UF}}(a(z), b(z)) = \frac12\,a(z)\,b(z).$$

As seen in the previous section, these $\mathcal{L}$ transforms involve only integration,
differentiation, and ordinary and Hadamard products; all are operations that
preserve the character of being $\Delta$-regular and admitting complete asymptotic
expansions at the dominant singularity. We then have a general result:

Theorem 16. For any of the binary search tree, uniform binary tree, or union-find
model, and for any integer $s > 0$, the $s$th moment of the cost function
associated to a toll $t_n^{\log}$ or $t_n^{\alpha}$ admits a complete descending expansion in powers
of $n$ (possibly with logarithmic terms).

Proof. The proof is simply an induction on the order $s$ of the moments. We
establish by induction the stronger property that the generating functions $\mu^{(s)}(z)$
are $\Delta$-regular and admit complete asymptotic expansions in powers of $(1-z)$,
possibly with logarithmic terms, after rescaling the singularity to be at 1. The
property is true for $s = 1$ by results of the previous section. If the property
is assumed to be true through order $s - 1$, then the tolls $r^{(s)}(z)$ are $\Delta$-regular
and admit complete asymptotic expansions at their singularity: this results
from closure theorems of Sections 3 and 4. Next, the $\mathcal{L}$-transform is applied
and, again by closure theorems, the property of $r^{(s)}(z)$ is seen to extend to
$\mu^{(s)}(z)$. Thus the singular structure of $\mu^{(s)}(z)$ is fully characterized. It then
suffices to apply basic singularity analysis in order to recover the existence of
full asymptotic expansion of the moments $\mu_n^{(s)} = \frac{1}{\omega_n}[z^n]\mu^{(s)}(z)$. □
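The recursion on moments that drives this induction can be exercised exactly on small sizes. The sketch below is an illustration under explicit assumptions: it takes the binary search tree model in the form $X_n = t_n + X_{K_n} + \tilde{X}_{n-1-K_n}$ with $K_n$ uniform on $\{0, \ldots, n-1\}$, independent subproblems, and $X_0 = 0$; the function names are hypothetical. For each order $s$ it builds the induced toll from lower-order moments and then solves the first-moment type of recurrence.

```python
from fractions import Fraction
from math import factorial


def multinomial(s, s1, s2, s3):
    """Multinomial coefficient s! / (s1! s2! s3!) with s1 + s2 + s3 = s."""
    return factorial(s) // (factorial(s1) * factorial(s2) * factorial(s3))


def bst_moments(toll, n_max, order):
    """Return mu with mu[s][n] = E(X_n^s), for s = 0..order and n = 0..n_max."""
    mu = [[Fraction(1)] * (n_max + 1)]          # order 0: E(X_n^0) = 1
    for s in range(1, order + 1):
        row = [Fraction(0)] * (n_max + 1)       # X_0 = 0, so mu[s][0] = 0
        prefix = Fraction(0)                    # row[0] + ... + row[n-1]
        for n in range(1, n_max + 1):
            induced = Fraction(0)               # induced toll of order s at size n
            for s2 in range(s):
                for s3 in range(min(s - 1, s - s2) + 1):
                    s1 = s - s2 - s3
                    cross = sum(mu[s2][k] * mu[s3][n - 1 - k] for k in range(n)) / n
                    induced += multinomial(s, s1, s2, s3) * Fraction(toll(n)) ** s1 * cross
            row[n] = induced + 2 * prefix / n   # same recurrence shape as first moments
            prefix += row[n]
        mu.append(row)
    return mu


if __name__ == "__main__":
    mu = bst_moments(lambda n: n, 10, 2)
    for n in range(1, 11):
        print(n, mu[1][n], mu[2][n] - mu[1][n] ** 2)  # mean and variance
```

For the toll $t_n = n$ the cost differs from the number of Quicksort comparisons (toll $n - 1$) by the deterministic amount $n$, so the variance printed above should coincide with the classical expression $7n^2 - 4(n+1)^2 H_n^{(2)} - 2(n+1)H_n + 13n$.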

The process of extracting moments one after the other has been nicknamed 
"moment pumping" in the article [25] , where it was used to determine the shape 
of the moments of total displacement in linear hashing tables. It had been
employed earlier by Louchard and Takacs in order to characterize moments of path
length in trees and of area under excursions [45, 61], in a way largely similar
to what has been described here in more general terms. In favorable cases, a 
pattern regarding the asymptotic shape of moments may emerge. In such cases 
(possibly centering of the random variable is required), the limiting distribution 
of costs becomes accessible through its moments, thanks to the moment con- 
vergence theorem. Instances are found in the already cited papers [25, 45, 61]. 
Fill's paper [16] provides another example (although it is based on direct re- 
currence manipulations rather than generating functions) to the effect that the 
logarithmic toll $t_n^{\log}$ gives rise to asymptotically Gaussian costs under the
binary search tree model. Yet other examples, often based on direct recurrence
manipulations, are provided by the recent independent studies of Hwang and 
Neininger [37] and of Fill and Kapur pjj] . Clearly, a "metatheorem" similar to 
Theorem 16 is possible for varieties of increasing trees in the sense of Bergeron-
Flajolet-Salvy [2] (generalizing the BST model). For simply generated families 
of trees in the sense of Meir and Moon [49] (generalizing the Catalan model), 
asymptotics of moments as well as limiting distributions have been derived by 
Fill and Kapur [17] as part of a broader project joint with Svante Janson. The 
union-find tree model can be generalized to other families of trees, and the tech- 
niques of the present paper can again be applied; this is the subject of ongoing 
research by the authors. 

6.2 Differential models 

Many tree recurrences associated to comparison-based searching and
multidimensional retrieval problems generalizing binary search trees, once translated
into generating functions, lead to integral equations of the form

$$\Phi[f](z) = t(z), \qquad (67)$$

where $\Phi$ is a linear integral operator involving coefficients in $\mathbb{C}(z)$, that is,
rational function coefficients. Here, as in our lead examples, $f(z)$ is a generating
function of expected costs and $t(z)$ is a toll generating function. By successive
differentiations, this transforms into a linear differential equation of the form

$$\Lambda[f](z) = \hat{t}(z), \qquad (68)$$

where $\hat{t}(z)$ is a modified toll generating function and is an elementary variant
of $t(z)$. We shall let $d$ denote the order of the differential equation (68).

The description above corresponds to the situation already encountered with
the binary search tree recurrence, representing the easy case of a differential
order equal to 1. Other known cases include the $m$-ary search tree studied by
Mahmoud and Pittel (see the account in [47]) and others [6, 19], quicksort with
median-of-sample partitioning and locally balanced trees [40, 58], quadtrees [23,
36] as well as multidimensional search trees also known as $k$-d-trees [28]. (A
valuable survey of a class of problems leading to Euler equations appears in [7].)
For instance, in the case of 2-dimensional quadtrees the operator is given in [36]
as

$$\Phi[f](z) = f(z) - 4\int_0^z \frac{dy}{y(1-y)} \int_0^y f(x)\,\frac{dx}{1-x},$$

which leads to a second order differential equation,

$$z(1-z)\,\partial_z^2 f(z) + (1-2z)\,\partial_z f(z) - \frac{4}{1-z}\,f(z) = \hat{t}(z),$$

where $\hat{t}(z) = \partial_z\bigl[z(1-z)\,t'(z)\bigr]$.
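The passage from the integral equation to the differential equation can be spelled out in two differentiations. The worked steps below are a sketch that assumes the two-dimensional quadtree equation in the form $f = t + 4\,\mathbf{J}\,\mathbf{I}\,f$, with $\mathbf{I}[g](z) = \int_0^z g(x)\,\frac{dx}{1-x}$ and $\mathbf{J}[g](z) = \int_0^z g(y)\,\frac{dy}{y(1-y)}$ (the operator names $\mathbf{I}$, $\mathbf{J}$ are introduced here for illustration only).

```latex
\begin{align*}
  f(z) - t(z)
    &= 4\int_0^z \frac{dy}{y(1-y)} \int_0^y f(x)\,\frac{dx}{1-x}, \\
  z(1-z)\,\partial_z\bigl[f(z) - t(z)\bigr]
    &= 4\int_0^z f(x)\,\frac{dx}{1-x}, \\
  \partial_z\Bigl[z(1-z)\,\partial_z\bigl(f(z) - t(z)\bigr)\Bigr]
    &= \frac{4}{1-z}\,f(z), \\
  z(1-z)\,\partial_z^2 f(z) + (1-2z)\,\partial_z f(z) - \frac{4}{1-z}\,f(z)
    &= \partial_z\bigl[z(1-z)\,t'(z)\bigr].
\end{align*}
```

Each differentiation removes one integral sign after clearing the rational factor in front of it, which is why the order of the resulting equation equals the number of nested integrations.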

The variation-of-constants technique applies to equations of order greater
than 1 as well as to linear systems. It may then be used to express $f$ as a linear
integral transform involving a set $\{h_j\}$ of solutions to the homogeneous equation
$\Lambda h = 0$, as we now explain. Indeed, let the linear differential equation (68) be
put into the form of a system

$$\partial_z \mathbf{y}(z) = A\,\mathbf{y}(z) + \mathbf{b}(z), \qquad (69)$$

where $\mathbf{y}$ is the $d$-dimensional vector $\mathbf{y} = (f, f', \ldots, f^{(d-1)})$, $A = A(z)$ is a $d \times d$
matrix of functions [here, by assumption, all in $\mathbb{C}(z)$], and $\mathbf{b} = (\hat{t}, \hat{t}', \ldots, \hat{t}^{(d-1)})$;
see [33, vol II, §9.3] for the reduction. Recall that a fundamental matrix $W$ for
the system (69) is by definition a nonsingular $d \times d$ matrix whose columns each
satisfy the homogeneous system $\partial_z \mathbf{y}(z) = A\,\mathbf{y}(z)$. Then the general solution
to the inhomogeneous system (69) is, by the classical "variation-of-constants"
formula,

$$\mathbf{y}(z) = W(z)\,W^{-1}(z_0)\,\mathbf{y}(z_0) + W(z) \int_{z_0}^z W^{-1}(x)\,\mathbf{b}(x)\,dx; \qquad (70)$$

see once more [33, vol II]. (The initial conditions at some $z_0$ are assumed to be
known.) This provides the solution to (67) as

$$f(z) = \mathcal{L}[t(z)],$$

with $\mathcal{L}$ a linear integral transform that involves polynomially the elements of a
fundamental matrix $W$ as well as the inverse of the Wronskian $\det W$. (The
case of Euler equations is somewhat simpler, as it is fully explicit (7J EES]-) For 
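instance, the case of 2-dimensional quadtrees leads to a still explicit form [36],
namely, $f(z) = \mathcal{L}[\hat{t}(z)]$ where

$$\mathcal{L}[\hat{t}(z)] = \frac{1+2z}{(1-z)^2} \int_0^z \frac{(1-y)^3}{y\,(1+2y)^2} \left[\int_0^y \frac{1+2x}{(1-x)^2}\,\hat{t}(x)\,dx\right] dy.$$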
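The prefactor $(1+2z)/(1-z)^2$ in this explicit form plays the role of a solution $h$ of the homogeneous equation. A quick way to see this is to check, on power series coefficients, that it annihilates the left-hand side of the second order equation. The sketch below assumes that equation in the form $z(1-z)\,h'' + (1-2z)\,h' - \frac{4}{1-z}\,h = 0$, multiplied through by $(1-z)$ to keep polynomial coefficients; the helper names are hypothetical.

```python
def poly_mul_trunc(poly, series, length):
    """First `length` coefficients of poly(z) * series(z)."""
    out = [0] * length
    for i, p in enumerate(poly):
        for j, s in enumerate(series):
            if i + j < length:
                out[i + j] += p * s
    return out


def derivative(series):
    """Coefficients of the derivative of a power series given by its coefficients."""
    return [k * c for k, c in enumerate(series)][1:]


def homogeneous_residual(n_terms):
    """Coefficients of z(1-z)^2 h'' + (1-2z)(1-z) h' - 4 h for h = (1+2z)/(1-z)^2."""
    h = [3 * n + 1 for n in range(n_terms + 2)]          # [z^n] h = (n+1) + 2n
    h1 = derivative(h)
    h2 = derivative(h1)
    second = poly_mul_trunc([0, 1, -2, 1], h2, n_terms)  # z(1-z)^2 = z - 2z^2 + z^3
    first = poly_mul_trunc([1, -3, 2], h1, n_terms)      # (1-2z)(1-z) = 1 - 3z + 2z^2
    return [second[n] + first[n] - 4 * h[n] for n in range(n_terms)]


if __name__ == "__main__":
    print(homogeneous_residual(10))
```

All residual coefficients vanish. The coefficients $3n+1$ of $h$ grow linearly, in agreement with a singular exponent $-2$ at $z = 1$, one of the two roots $\pm 2$ of the indicial equation of this differential equation at 1.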
From here onward we suppose for simplicity that $\omega_n = 1$, so that $f$ is an
ordinary generating function, though our discussion extends readily to more
general normalization constants. Call a system dominantly regular$^9$ if it is
singular at 1 (i.e., if the matrix $A$ has a pole at 1, but at no other point
in $|z| \le 1$ except possibly 0) and if the pole of $A$ at 1 is simple; the latter case
is known as a singularity "of the first kind". All the classical examples listed
above and generalizing binary search trees satisfy this condition. We then have:

Theorem 17. Let a tree recurrence be expressed by a differential system that
is dominantly regular. Then the expectations of costs induced by the tolls $t_n^{\alpha}$
and $t_n^{\log}$ admit complete asymptotic expansions in descending powers of $n$,
possibly with logarithmic terms.

Proof. First, we observe that for any tree recurrence, the cost induced by an
eventually increasing nonnegative toll $t_n \to +\infty$ is at least $t_n$ (by the very
nature of the tree recurrence) and at most $O(n\,t_n)$ (by induction). Thus, for
the tolls under consideration, the generating function of costs, $f(z)$, has radius
of convergence exactly equal to 1. We also observe that the values of $f$ and its
derivatives at some point $z_0$ such that $|z_0| < 1$ are well-defined. We may adopt
for instance $z_0 = \frac12$ in the variation-of-constants formula.

By the classical theory of singularities of the first kind, each of the column
vectors of matrix $W$ is analytic for $z$ in a neighborhood of 1 slit along the
ray $[1, +\infty)$. There, as $z$ tends to 1, it admits a representation as a finite
combination of terms of the form

$$(1-z)^{a}\,L(z)^{k}\,R(1-z),$$

where $a$ is an algebraic number (a root of the indicial equation), $k$ an integer,
and $R$ is analytic at 0. Thus, each element of $W$ is amenable to singularity
analysis as it is $\Delta$-continuable and admits a bona fide expansion near 1.

By formula (70), there remains to discuss the elements of $W^{-1}$. By the
cofactor rule, the elements of $W^{-1}$ involve polynomially the elements of $W$
divided by the Wronskian determinant $\det W(z)$. It is a well-known fact (see
§9.3 of [33, vol II]) that the Wronskian is expressible in terms of the system
alone and one has

$$\det W(z) = \det W(z_0)\,\exp\Bigl(\int_{z_0}^z \operatorname{tr} A(x)\,dx\Bigr).$$

[Here $\operatorname{tr}(\cdot)$ denotes the matrix trace operator.] By the dominant regularity
assumption, the trace is here a rational function with at most a simple pole
at 1, so that its integral is either analytic at 1 or logarithmic. In either case,
$(\det W(z))^{-1}$ is of singularity analysis type, and so are consequently all the
elements of the inverse of the fundamental matrix $W^{-1}(z)$. By the singular
integration and singular differentiation theorems of Section 3, there results
that the integral transform (70) preserves for functions the character of being
amenable to singularity analysis. Since the toll generating functions are of
singularity analysis type, basic singularity analysis is applicable to $f(z)$. The result
follows. □

$^9$ The term "dominantly regular" evokes the fact that the condition concerns the dominant
singularity of the solution function, where the singularity of the system is of the so-called
"regular" type (first-kind implies regular singularity by a well-known theorem; see [33, vol II,
Theorem 9.4d]).
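The scalar case $d = 1$ gives a compact illustration of the objects in this proof. The worked lines below are a sketch that assumes the binary search tree recurrence in the form $f_n = t_n + \frac{2}{n}\sum_{k<n} f_k$ with $f_0 = 0$, and takes $z_0 = 0$ with $f(0) = 0$; under these assumptions the system reduces to a single equation whose coefficient has a simple pole at 1.

```latex
\begin{align*}
  \partial_z f(z) &= \frac{2}{1-z}\,f(z) + t'(z),
     \qquad A(z) = \frac{2}{1-z}, \quad b(z) = t'(z), \\
  W(z) &= \exp\Bigl(\int_0^z \operatorname{tr} A(x)\,dx\Bigr)
        = \exp\Bigl(\int_0^z \frac{2\,dx}{1-x}\Bigr) = (1-z)^{-2}, \\
  f(z) &= W(z)\int_0^z W^{-1}(x)\,b(x)\,dx
        = \frac{1}{(1-z)^2}\int_0^z t'(x)\,(1-x)^2\,dx .
\end{align*}
```

Here the fundamental "matrix" and its Wronskian coincide, both are pure powers of $(1-z)$, and the resulting transform is the succession "differentiate, multiply by $(1-z)^2$, integrate, multiply by $(1-z)^{-2}$" used for binary search trees in Section 5.1.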

In principle, higher moments will also become accessible to singularity
analysis once the nonlinear integral forms $Q$ extending $Q^{\mathrm{BST}}$ of (66) have been
worked out. We are however not aware of existing research in this direction,
despite the fact that the splitting probabilities are known in a number of cases
(see, e.g., [23] for quadtrees). There is interest in these questions, as partly
heuristic recent work by Majumdar and collaborators (see, e.g., [12] for the
type of method employed and succinct developments) indicates the probable
existence of phase transitions in the number of internal nodes of $d$-dimensional
quadtrees for large enough $d$ ($d \ge d_c = 9$ is suggested) in a way similar to what
is already well established for the size of $m$-ary search trees [6, 19, 47].

As a final note, we would like to mention digital trees, which were recognized
to be amenable to treatment by ordinary (rather than the more customary
exponential) generating functions in [27]. Techniques of the present paper would
most likely be usable in such a context, in particular as regards tolls of the
form $n^{\alpha}$ and $\log n$. A partial classification of cost functions along these lines
has already been given by Derfel and Vogl in [13].